NUCLEIC ACIDS AND PROTEINS FROM STREPTOCOCCUS PNEUMONIAE

The present invention relates to proteins derived from Streptococcus pneumoniae, 
nucleic acid molecules encoding such proteins, the use of the nucleic acid and/or
proteins as antigens/immunogens and in detection/diagnosis, as well as methods for 
screening the proteins/nucleic acid sequences as potential anti-microbial targets. 

Streptococcus pneumoniae, commonly referred to as the pneumococcus, is an 
important pathogenic organism. The continuing significance of Streptococcus
pneumoniae infections in relation to human disease in developing and developed
countries has been authoritatively reviewed (Siber, G.R., Science, 265: 1385-1387
(1994)). That review indicates that on a global scale this organism is believed to be the
most common bacterial cause of acute respiratory infections, and is estimated to
result in 1 million childhood deaths each year, mostly in developing countries
(Stansfield, S.K., Pediatr. Infect. Dis., 6: 622 (1987)). In the USA it has been
suggested (Breiman et al. Arch. Intern. Med., 150: 1401 (1990)) that the 
pneumococcus is still the most common cause of bacterial pneumonia, and that 
disease rates are particularly high in young children, in the elderly, and in patients 
with predisposing conditions such as asplenia, heart, lung and kidney disease, 
diabetes, alcoholism, or with immunosuppressive disorders, especially AIDS. These
groups are at higher risk of pneumococcal septicaemia and hence meningitis and 
therefore have a greater risk of dying from pneumococcal infection. The 
pneumococcus is also the leading cause of otitis media and sinusitis, which remain 
prevalent infections in children in developed countries, and which incur substantial 
costs. 

The need for effective preventative strategies against pneumococcal infection is 
highlighted by the recent emergence of penicillin-resistant pneumococci. It has been 
reported that 6.6% of pneumococcal isolates in 13 US hospitals in 12 states were found
to be resistant to penicillin and some isolates were also resistant to other antibiotics
including third generation cephalosporins (Schappert, S.M., Vital and Health Statistics
of the Centres for Disease Control/National Centre for Health Statistics, 214:1
(1992)). The rates of penicillin resistance can be higher (up to 20%) in some
hospitals (Breiman et al., J. Am. Med. Assoc., 271: 1831 (1994)). Since the
development of penicillin resistance among pneumococci is both recent and sudden, 
coming after decades during which penicillin remained an effective treatment, these 
findings are regarded as alarming. 

For the reasons given above, there are compelling grounds for considering
improvements in the means of preventing, controlling, diagnosing or treating 
pneumococcal diseases. 

Various approaches have been taken in order to provide vaccines for the prevention 
of pneumococcal infections. Difficulties arise, for instance, in view of the variety of
serotypes (at least 90) based on the structure of the polysaccharide capsule
surrounding the organism. Vaccines against individual serotypes are not effective
against other serotypes and this means that vaccines must include polysaccharide
antigens from a whole range of serotypes in order to be effective in a majority of
cases. An additional problem arises because it has been found that the capsular
polysaccharides (each of which determines the serotype and is the major protective
antigen) when purified and used as a vaccine do not reliably induce protective
antibody responses in children under two years of age, the age group which suffers
the highest incidence of invasive pneumococcal infection and meningitis.

A modification of the approach using capsule antigens relies on conjugating the 
polysaccharide to a protein in order to derive an enhanced immune response, 
particularly by giving the response T-cell dependent character. This approach has
been used in the development of a vaccine against Haemophilus influenzae, for
instance. There are, however, issues of cost concerning both the multi-polysaccharide
vaccines and those based on conjugates.

A third approach is to look for other antigenic components which offer the potential
to be vaccine candidates. This is the basis of the present invention. Using a specially
developed bacterial expression system, we have been able to identify a group of
protein antigens from pneumococcus which are associated with the bacterial
envelope or which are secreted.

Thus, in a first aspect the present invention provides a Streptococcus pneumoniae 
protein or polypeptide having a sequence selected from those shown in Table 1.

In a second aspect, the present invention provides a Streptococcus pneumoniae
protein or polypeptide having a sequence selected from those shown in Table 2.

A protein or polypeptide of the present invention may be provided in substantially pure
form. For example, it may be provided in a form which is substantially free of other
proteins.

As discussed herein, the proteins and polypeptides of the invention are useful as 
antigenic material. Such material can be "antigenic" and/or "immunogenic". 
Generally, "antigenic" is taken to mean that the protein or polypeptide is capable of 
being used to raise antibodies or indeed is capable of inducing an antibody response in 
a subject. "Immunogenic" is taken to mean that the protein or polypeptide is capable of
eliciting a protective immune response in a subject. Thus, in the latter case, the protein
or polypeptide may be capable of not only generating an antibody response but, in
addition, a non-antibody based immune response.

The skilled person will appreciate that homologues or derivatives of the proteins or
polypeptides of the invention will also find use in the context of the present invention,
i.e. as antigenic/immunogenic material. Thus, for instance, proteins or polypeptides
which include one or more additions, deletions, substitutions or the like are
encompassed by the present invention. In addition, it may be possible to replace one
amino acid with another of similar "type", for instance replacing one hydrophobic
amino acid with another.

One can use a program such as the CLUSTAL program to compare amino acid 
sequences. This program compares amino acid sequences and finds the optimal 
alignment by inserting spaces in either sequence as appropriate. It is possible to 
calculate amino acid identity or similarity (identity plus conservation of amino acid 
type) for an optimal alignment. A program like BLASTx will align the longest stretch 
of similar sequences and assign a value to the fit. It is thus possible to obtain a 
comparison where several regions of similarity are found, each having a different 
score. Both types of identity analysis are contemplated in the present invention. 
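
By way of illustration only, the calculation of identity and similarity for an optimal alignment might be sketched as follows. The sketch assumes the two sequences have already been aligned (for example by CLUSTAL) with "-" marking inserted spaces. The grouping of amino acids by "type" in SIMILAR_GROUPS and the function names are hypothetical and are not taken from any particular program.

```python
# Hypothetical grouping of amino acids by "type"; the text does not define
# the groups, so these sets are illustrative assumptions only.
SIMILAR_GROUPS = [
    set("AVLIMFWC"),  # hydrophobic (assumed)
    set("STNQYGP"),   # polar or small (assumed)
    set("DE"),        # acidic
    set("KRH"),       # basic
]


def same_type(a, b):
    """Return True if two residues fall in the same assumed group."""
    return any(a in group and b in group for group in SIMILAR_GROUPS)


def identity_and_similarity(aligned_a, aligned_b, gap="-"):
    """Percent identity and similarity for two already-aligned sequences.

    Similarity is counted as identity plus conservation of amino acid type.
    Columns in which both sequences carry a gap are ignored; columns with a
    single gap count towards the alignment length but not as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identical = similar = compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == gap or b == gap:
            continue
        if a == b:
            identical += 1
            similar += 1
        elif same_type(a, b):
            similar += 1
    if compared == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


if __name__ == "__main__":
    ident, simil = identity_and_similarity("MKV-LA", "MRVALA")
    print(f"identity {ident:.1f}%, similarity {simil:.1f}%")
```

Whether gapped columns are counted in the denominator is a design choice; a different convention would give different percentages for the same alignment.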

In the case of homologues and derivatives, the degree of identity with a protein or 
polypeptide as described herein is less important than that the homologue or derivative 
should retain the antigenicity or immunogenicity of the original protein or polypeptide.
However, suitably, homologues or derivatives having at least 60% similarity (as 
discussed above) with the proteins or polypeptides described herein are provided. 
Preferably, homologues or derivatives having at least 70% similarity, more preferably 
at least 80% similarity are provided. Most preferably, homologues or derivatives 
having at least 90% or even 95% similarity are provided. 

In an alternative approach, the homologues or derivatives could be fusion proteins, 
incorporating moieties which render purification easier, for example by effectively
tagging the desired protein or polypeptide. It may be necessary to remove the "tag" or
it may be the case that the fusion protein itself retains sufficient antigenicity to be
useful.

In an additional aspect of the invention there are provided antigenic/immunogenic 
fragments of the proteins or polypeptides of the invention, or of homologues or 
derivatives thereof. 

For fragments of the proteins or polypeptides described herein, or of homologues or 
derivatives thereof, the situation is slightly different. It is well known that it is possible to
screen an antigenic protein or polypeptide to identify epitopic regions, i.e. those regions
which are responsible for the protein or polypeptide's antigenicity or immunogenicity.
Methods for carrying out such screening are well known in the art. Thus, the fragments
of the present invention should include one or more such epitopic regions or be
sufficiently similar to such regions to retain their antigenic/immunogenic properties.
Thus, for fragments according to the present invention the degree of identity is perhaps
irrelevant, since they may be 100% identical to a particular part of a protein or
polypeptide, homologue or derivative as described herein. The key issue, once again, is 
that the fragment retains the antigenic/immunogenic properties. 

Thus, what is important for homologues, derivatives and fragments is that they possess 
at least a degree of the antigenicity/immunogenicity of the protein or polypeptide from 
which they are derived. 

Gene cloning techniques may be used to provide a protein of the invention in 
substantially pure form. These techniques are disclosed, for example, in J. Sambrook
et al., Molecular Cloning, 2nd Edition, Cold Spring Harbor Laboratory Press (1989).
Thus, in a third aspect, the present invention provides a nucleic acid molecule
comprising or consisting of a sequence which is:

(i) any of the DNA sequences set out in Table 1 or their RNA equivalents; 

(ii) a sequence which is complementary to any of the sequences of (i); 

(iii) a sequence which codes for the same protein or polypeptide, as those 
sequences of (i) or (ii); 

(iv) a sequence which has substantial identity with any of those of (i), (ii) 
and (iii); or

(v) a sequence which codes for a homologue, derivative or fragment of a
protein as defined in Table 1.

In a fourth aspect the present invention provides a nucleic acid molecule comprising or 
consisting of a sequence which is: 

(i) any of the DNA sequences set out in Table 2 or their RNA equivalents; 

(ii) a sequence which is complementary to any of the sequences of (i); 

(iii) a sequence which codes for the same protein or polypeptide, as those 
sequences of (i) or (ii); 

(iv) a sequence which has substantial identity with any of those of (i), (ii) 
and (iii); or

(v) a sequence which codes for a homologue, derivative or fragment of a
protein as defined in Table 2. 

The nucleic acid molecules of the invention may include a plurality of such sequences, 
and/or fragments. The skilled person will appreciate that the present invention can 
include novel variants of those particular novel nucleic acid molecules which are 
exemplified herein. Such variants are encompassed by the present invention. These 
may occur in nature, for example because of strain variation. For example, additions, 
substitutions and/or deletions are included. In addition, and particularly when utilising 
microbial expression systems, one may wish to engineer the nucleic acid sequence by
making use of known preferred codon usage in the particular organism being used for
expression. Thus, synthetic or non-naturally occurring variants are also included within
the scope of the invention. 
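
As an illustration of engineering a sequence for preferred codon usage, a coding sequence might be translated and then re-encoded with one preferred codon per amino acid, as sketched below. The standard genetic code is used for translation. The PREFERRED_CODON table is a hypothetical placeholder and does not represent measured codon usage for any organism; in practice it would be replaced by usage data for the chosen expression host.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}

# Hypothetical preferred codon per residue ("*" is a stop codon). These are
# placeholder choices for illustration, not data for a real host organism.
PREFERRED_CODON = {
    "A": "GCT", "R": "CGT", "N": "AAT", "D": "GAT", "C": "TGT",
    "Q": "CAA", "E": "GAA", "G": "GGT", "H": "CAT", "I": "ATT",
    "L": "TTA", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCA",
    "S": "TCT", "T": "ACT", "W": "TGG", "Y": "TAT", "V": "GTT",
    "*": "TAA",
}


def translate(dna):
    """Translate a coding DNA sequence, codon by codon, in frame 1."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna):
    """Re-encode a coding sequence using the preferred codon for each residue.

    The encoded protein is unchanged; only synonymous codons are swapped.
    """
    return "".join(PREFERRED_CODON[aa] for aa in translate(dna))


if __name__ == "__main__":
    original = "ATGAAGCTGTGA"
    print(original, "->", recode(original))
```

The recoded sequence is a synthetic variant of the kind referred to above: it differs at the nucleotide level while coding for the same protein.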

The term "RNA equivalent" when used above indicates that a given RNA molecule has 
a sequence which is complementary to that of a given DNA molecule (allowing for the 
fact that in RNA "U" replaces "T" in the genetic code). 
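
A minimal sketch of this definition is given below. It assumes that "complementary" is read literally, so that the RNA strand pairs with the given DNA strand, and that both strands are written 5' to 3'. The function names are illustrative only. A second helper shows the simpler same-sense conversion in which "T" is merely replaced by "U".

```python
# Base pairing between a DNA strand and the RNA strand complementary to it.
_DNA_TO_RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}


def complementary_rna(dna):
    """RNA complementary to a DNA strand, both written 5' to 3'.

    The DNA is read in reverse so that the returned RNA is also 5' to 3'.
    """
    return "".join(_DNA_TO_RNA_PAIR[base] for base in reversed(dna.upper()))


def same_sense_rna(dna):
    """RNA with the same sense as the DNA strand: only T is replaced by U."""
    return dna.upper().replace("T", "U")


if __name__ == "__main__":
    print(complementary_rna("ATGC"), same_sense_rna("ATGC"))
```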

When comparing nucleic acid sequences for the purposes of determining the degree of 
homology or identity one can use programs such as BESTFIT and GAP (both from the
Wisconsin Genetics Computer Group (GCG) software package). BESTFIT, for
example, compares two sequences and produces an optimal alignment of the most
similar segments. GAP enables sequences to be aligned along their whole length and
finds the optimal alignment by inserting spaces in either sequence as appropriate. 
Suitably, in the context of the present invention when discussing identity of nucleic acid 
sequences, the comparison is made by alignment of the sequences along their whole 
length.

Preferably, sequences which have substantial identity have at least 50% sequence
identity, desirably at least 75% sequence identity and more desirably at least 90% or at
least 95% sequence identity with said sequences. In some cases the sequence identity 
may be 99% or above. 
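
To illustrate how a percentage identity over the whole length of two sequences might be obtained, the following sketch performs a simple global (Needleman-Wunsch style) alignment and reports identity over the aligned length. It is not the GAP program itself: the scoring values (match +1, mismatch -1, space -1) and the function names are assumptions chosen for illustration, and GAP uses its own scoring scheme.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Align two sequences along their whole length, inserting '-' as needed."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
            match if a[i - 1] == b[j - 1] else mismatch
        ):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Percent identity over the full length of the global alignment."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    aligned_a, aligned_b = global_align(a.upper(), b.upper())
    matches = sum(x == y for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    print(global_align("ACGT", "ACT"))
    print(percent_identity("ACGT", "ACT"))
```

The resulting figure can then be compared with the thresholds given above (for example 50%, 75%, 90% or 95%).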

Desirably, the term "substantial identity" indicates that said sequence has a greater 
degree of identity with any of the sequences described herein than with prior art nucleic
acid sequences. 

It should however be noted that where a nucleic acid sequence of the present invention 
codes for at least part of a novel gene product the present invention includes within its 
scope all possible sequences coding for the gene product or for a novel part thereof.

The nucleic acid molecule may be in isolated or recombinant form. It may be 
incorporated into a vector and the vector may be incorporated into a host. Such vectors 
and suitable hosts form yet further aspects of the present invention. 

Therefore, for example, by using probes based upon the nucleic acid sequences 
provided herein, genes in Streptococcus pneumoniae can be identified. They can then 
be excised using restriction enzymes and cloned into a vector. The vector can be
introduced into a suitable host for expression. 
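
As a rough illustration of excision with restriction enzymes, the sketch below locates recognition sites in a linear sequence and returns the fragments that digestion would be expected to produce. Only top-strand cut positions are modelled and the example sequence is invented for the purpose; the function names are hypothetical. The recognition sites shown for EcoRI, BamHI and HindIII are the well-known ones (G^AATTC, G^GATCC and A^AGCTT).

```python
# Recognition sequence and top-strand cut offset for a few common enzymes.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "BamHI": ("GGATCC", 1),
    "HindIII": ("AAGCTT", 1),
}


def cut_positions(seq, site, offset):
    """Return every top-strand cut position for one recognition site."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def digest(seq, enzyme_names):
    """Cut a linear sequence with the named enzymes and return the fragments."""
    seq = seq.upper()
    cuts = set()
    for name in enzyme_names:
        site, offset = ENZYMES[name]
        cuts.update(cut_positions(seq, site, offset))
    bounds = [0] + sorted(cuts) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    example = "AAAGAATTCCCCGGATCCTTT"  # invented sequence for illustration
    for fragment in digest(example, ["EcoRI", "BamHI"]):
        print(len(fragment), fragment)
```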

Nucleic acid molecules of the present invention may be obtained from S. pneumoniae by
the use of appropriate probes complementary to part of the sequences of the nucleic
acid molecules. Restriction enzymes or sonication techniques can be used to obtain
appropriately sized fragments for probing.

Alternatively, PCR techniques may be used to amplify a desired nucleic acid sequence.
Thus the sequence data provided herein can be used to design two primers for use in
PCR so that a desired sequence, including whole genes or fragments thereof, can be
targeted and then amplified to a high degree.

Typically primers will be at least 15-25 nucleotides long. 
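
By way of example only, a pair of primers flanking a desired sequence might be derived as sketched below: the forward primer is taken from the start of the target and the reverse primer is the reverse complement of its end. The default length of 20 nucleotides, the function names and the use of the simple "2 + 4" (Wallace) rule for a rough melting temperature are assumptions made for illustration; practical primer design would consider further factors.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def wallace_tm(primer):
    """Rough melting temperature: 2 degrees per A/T plus 4 degrees per G/C."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))


def design_primers(target, length=20):
    """Return (forward, reverse) primers flanking the whole target sequence."""
    if length < 15:
        raise ValueError("primers shorter than 15 nucleotides are not considered")
    if len(target) < 2 * length:
        raise ValueError("target too short for two non-overlapping primers")
    target = target.upper()
    forward = target[:length]
    reverse = reverse_complement(target[-length:])
    return forward, reverse


if __name__ == "__main__":
    # Invented target sequence, used only to exercise the functions.
    target = "ATGGCTAGCTAGGATCCGATCGATTACGCTAGCTAGGCTAA"
    fwd, rev = design_primers(target, length=15)
    print(fwd, wallace_tm(fwd))
    print(rev, wallace_tm(rev))
```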

As a further alternative chemical synthesis may be used. This may be automated. 
Relatively short sequences may be chemically synthesised and ligated together to 
provide a longer sequence. 

There is another group of proteins from S. pneumoniae which have been identified 
using the bacterial expression system described herein. These are known proteins 
from S. pneumoniae, which have not previously been identified as antigenic proteins.
The amino acid sequences of this group of proteins, together with DNA sequences 
coding for them are shown in Table 3. These proteins, or homologues, derivatives 
and/or fragments thereof also find use as antigens/immunogens. Thus, in another 
aspect the present invention provides the use of a protein or polypeptide having a 
sequence selected from those shown in Tables 1-3, or homologues, derivatives 
and/or fragments thereof, as an immunogen/antigen. 

In yet a further aspect the present invention provides an immunogenic/antigenic 
composition comprising one or more proteins or polypeptides selected from those 
whose sequences are shown in Tables 1-3, or homologues or derivatives thereof, 
and/or fragments of any of these. In preferred embodiments, the 
immunogenic/antigenic composition is a vaccine or is for use in a diagnostic assay. 

In the case of vaccines, suitable additional excipients, diluents, adjuvants or the like
may be included. Numerous examples of these are well known in the art.

It is also possible to utilise the nucleic acid sequences shown in Tables 1-3 in the 
preparation of so-called DNA vaccines. Thus, the invention also provides a vaccine 
composition comprising one or more nucleic acid sequences as defined herein. DNA
vaccines are described in the art (see for instance, Donnelly et al., Ann. Rev.
Immunol., 15:617-648 (1997)) and the skilled person can use such art-described
techniques to produce and use DNA vaccines according to the present invention.

As already discussed herein, the proteins or polypeptides described herein, their
homologues or derivatives, and/or fragments of any of these, can be used in methods
of detecting/diagnosing S. pneumoniae. Such methods can be based on the detection
of antibodies against such proteins which may be present in a subject. Therefore the
present invention provides a method for the detection/diagnosis of S. pneumoniae
which comprises the step of bringing into contact a sample to be tested with at least
one protein, or homologue, derivative or fragment thereof, as described herein.
Suitably, the sample is a biological sample, such as a tissue sample or a sample of 
blood or saliva obtained from a subject to be tested. 

In an alternative approach, the proteins described herein, or homologues, derivatives
and/or fragments thereof, can be used to raise antibodies, which in turn can be used
to detect the antigens, and hence S. pneumoniae. Such antibodies form another aspect
of the invention. Antibodies within the scope of the present invention may be
monoclonal or polyclonal.

Polyclonal antibodies can be raised by stimulating their production in a suitable animal
host (e.g. a mouse, rat, guinea pig, rabbit, sheep, goat or monkey) when a protein as
described herein, or a homologue, derivative or fragment thereof, is injected into the
animal. If desired, an adjuvant may be administered together with the protein.
Well-known adjuvants include Freund's adjuvant (complete and incomplete) and aluminium
hydroxide. The antibodies can then be purified by virtue of their binding to a protein as 
described herein. 

Monoclonal antibodies can be produced from hybridomas. These can be formed by 
fusing myeloma cells and spleen cells which produce the desired antibody in order to 
form an immortal cell line. Thus the well-known Kohler & Milstein technique (Nature
256 (1975)) or subsequent variations upon this technique can be used.

Techniques for producing monoclonal and polyclonal antibodies that bind to a 
particular polypeptide/protein are now well developed in the art. They are discussed in 
standard immunology textbooks, for example in Roitt et al., Immunology, second edition
(1989), Churchill Livingstone, London. 

In addition to whole antibodies, the present invention includes derivatives thereof which 
are capable of binding to proteins etc as described herein. Thus the present invention 
includes antibody fragments and synthetic constructs. Examples of antibody fragments 
and synthetic constructs are given by Dougall et al. in Tibtech, 12: 372-379 (September
1994).

Antibody fragments include, for example, Fab, F(ab')2 and Fv fragments (these are
discussed in Roitt et al. [supra]). Fv fragments can be modified
to produce a synthetic construct known as a single chain Fv (scFv) molecule. This
includes a peptide linker covalently joining VH and VL regions, which contributes to the
stability of the molecule. Other synthetic constructs that can be used include CDR
peptides. These are synthetic peptides comprising antigen-binding determinants.
Peptide mimetics may also be used. These molecules are usually conformationally
restricted organic rings that mimic the structure of a CDR loop and that include
antigen-interactive side chains. 

Synthetic constructs include chimaeric molecules. Thus, for example, humanised (or 
primatised) antibodies or derivatives thereof are within the scope of the present 
invention. An example of a humanised antibody is an antibody having human 
framework regions, but rodent hypervariable regions. Ways of producing chimaeric 
antibodies are discussed for example by Morrison et al. in PNAS, 81, 6851-6855 (1984)
and by Takeda et al. in Nature, 314, 452-454 (1985).

Synthetic constructs also include molecules comprising an additional moiety that 
provides the molecule with some desirable property in addition to antigen binding. For 
example the moiety may be a label (e.g. a fluorescent or radioactive label). 
Alternatively, it may be a pharmaceutically active agent. 

Antibodies, or derivatives thereof, find use in detection/diagnosis of S. pneumoniae.
Thus, in another aspect the present invention provides a method for the
detection/diagnosis of S. pneumoniae which comprises the step of bringing into contact
a sample to be tested and antibodies capable of binding to one or more proteins 
described herein, or to homologues, derivatives and/or fragments thereof. 

In addition, so-called "Affibodies" may be utilised. These are binding proteins 
selected from combinatorial libraries of an alpha-helical bacterial receptor domain
(Nord et al.). Thus, small protein domains capable of specific binding to different
target proteins can be selected using combinatorial approaches. 

It will also be clear that the nucleic acid sequences described herein may be used to 
detect/diagnose S. pneumoniae. Thus, in yet a further aspect, the present invention
provides a method for the detection/diagnosis of S. pneumoniae which comprises the
step of bringing into contact a sample to be tested with at least one nucleic acid
sequence as described herein. Suitably, the sample is a biological sample, such as a
tissue sample or a sample of blood or saliva obtained from a subject to be tested.
Such samples may be pre-treated before being used in the methods of the invention.
Thus, for example, a sample may be treated to extract DNA. Then, DNA probes
based on the nucleic acid sequences described herein (i.e. usually fragments of such
sequences) may be used to detect nucleic acid from S. pneumoniae.

In additional aspects, the present invention provides: 

(a) a method of vaccinating a subject against S. pneumoniae which comprises the
step of administering to a subject a protein or polypeptide of the invention, or a
derivative, homologue or fragment thereof, or an immunogenic composition of the
invention;

(b) a method of vaccinating a subject against S. pneumoniae which comprises the
step of administering to a subject a nucleic acid molecule as defined herein;

(c) a method for the prophylaxis or treatment of S. pneumoniae infection which
comprises the step of administering to a subject a protein or polypeptide of the
invention, or a derivative, homologue or fragment thereof, or an immunogenic
composition of the invention;

(d) a method for the prophylaxis or treatment of S. pneumoniae infection which
comprises the step of administering to a subject a nucleic acid molecule as defined
herein;

(e) a kit for use in detecting/diagnosing S. pneumoniae infection comprising one
or more proteins or polypeptides of the invention, or homologues, derivatives or
fragments thereof, or an antigenic composition of the invention; and

(f) a kit for use in detecting/diagnosing S. pneumoniae infection comprising one
or more nucleic acid molecules as defined herein.

Given that we have identified a group of important proteins, such proteins are 
potential targets for anti-microbial therapy. It is necessary, however, to determine 
whether each individual protein is essential for the organism's viability. Thus, the
present invention also provides a method of determining whether a protein or 
polypeptide as described herein represents a potential anti-microbial target which 
comprises antagonising, inhibiting or otherwise interfering with the function or 
expression of said protein and determining whether S. pneumoniae is still viable. 

A suitable method for inactivating the protein is to effect selected gene knockouts, i.e.
to prevent expression of the protein and determine whether this results in a lethal
change. Suitable methods for carrying out such gene knockouts are described in Li
et al., P.N.A.S., 94:13251-13256 (1997) and Kolkman et al., 178:3736-3741 (1996).

In a final aspect the present invention provides the use of an agent capable of 
antagonising, inhibiting or otherwise interfering with the function or expression of a 
protein or polypeptide of the invention in the manufacture of a medicament for use in 
the treatment or prophylaxis of S. pneumoniae infection.

As mentioned above, we have used a bacterial expression system as a means of
identifying those proteins which are surface associated, secreted or exported and
thus would find use as antigens.

The information necessary for the secretion/export of proteins has been extensively 
studied in bacteria. In the majority of cases, protein export requires a signal peptide 
to be present at the N-terminus of the precursor protein so that it becomes directed to
the translocation machinery on the cytoplasmic membrane. During or after
translocation, the signal peptide is removed by a membrane associated signal
peptidase. Ultimately the localization of the protein (i.e. whether it be secreted, an
integral membrane protein or attached to the cell wall) is determined by sequences
other than the leader peptide itself.

We are specifically interested in surface located or exported proteins as these are 
likely to be antigens for use in vaccines, as diagnostic reagents or as targets for 
therapy with novel chemical entities. We have therefore developed a screening
vector-system in Lactococcus lactis that permits genes encoding exported proteins to
be identified and isolated. We provide below a representative example showing how
given novel surface associated proteins from Streptococcus pneumoniae have been
identified and characterized. The screening vector incorporates the staphylococcal
nuclease gene nuc lacking its own export signal as a secretion reporter.

Staphylococcal nuclease is a naturally secreted heat-stable, monomeric enzyme
which has been efficiently expressed and secreted in a range of Gram positive
bacteria (Shortle, Gene, 22:181-189 (1983); Kovacevic et al., J. Bacteriol.,
162:521-528 (1985); Miller et al., J. Bacteriol., 169:3508-3514 (1987); Liebl et al.,
J. Bacteriol., 174:1854-1861 (1992); Le Loir et al., J. Bacteriol., 176:5135-5139
(1994); Poquet et al., J. Bacteriol., 180:1904-1912 (1998)).

Recently, Poquet et al. ((1998), supra) have described a screening vector
incorporating the nuc gene lacking its own signal leader as a reporter to identify
exported proteins in Gram positive bacteria, and have applied it to L. lactis. This
vector (pFUN) contains the pAMβ1 replicon which functions in a broad host range
of Gram-positive bacteria in addition to the ColE1 replicon that promotes replication
in Escherichia coli and certain other Gram negative bacteria. Unique cloning sites
present in the vector can be used to generate transcriptional and translational fusions
between cloned genomic DNA fragments and the open reading frame of the
truncated nuc gene devoid of its own signal secretion leader. The nuc gene makes an
ideal reporter gene because the secretion of nuclease can readily be detected using a
simple and sensitive plate test: recombinant colonies secreting the nuclease develop
a pink halo whereas control colonies remain white (Shortle, (1983), supra; Le Loir
et al., (1994), supra).

Thus, the invention will now be described with reference to the following 
representative example, which provides details of how the proteins, polypeptides and 
nucleic acid sequences described herein were identified as antigenic targets.

We describe herein the construction of three reporter vectors and their use in L. 
lactis to identify and isolate genomic DNA fragments from Streptococcus 
pneumoniae encoding secreted or surface associated proteins. 

The invention will now be described with reference to the examples, which should 
not be construed as in any way limiting the invention. The examples refer to the
figures in which:

Figure 1: shows the results of a number of DNA vaccine trials; and
Figure 2: shows the results of further DNA vaccine trials.

EXAMPLE 1

(i) Construction of the pTREP1-nuc series of reporter vectors

(a) Construction of expression plasmid pTREP1

The pTREP1 plasmid is a high-copy number (40-80 per cell) theta-replicating
Gram-positive plasmid, which is a derivative of the pTREX plasmid which is itself a
derivative of the previously published pIL253 plasmid. pIL253 incorporates the
broad Gram-positive host range replicon of pAMβ1 (Simon and Chopin, Biochimie,
70:559-567 (1988)) and is non-mobilisable by the L. lactis sex-factor. pIL253 also
lacks the tra function which is necessary for transfer or efficient mobilisation by
conjugative parent plasmids exemplified by pIL501. The Enterococcal pAMβ1
replicon has previously been transferred to various species including Streptococcus,
Lactobacillus and Bacillus species as well as Clostridium acetobutylicum (Oultram
and Klaenhammer, FEMS Microbiology Letters, 27:129-134 (1985); Gibson et
al. (1979); LeBlanc et al., Proceedings of the National Academy of Sciences USA,
75:3484-3487 (1978)), indicating the potential broad host range utility. The pTREP1
plasmid represents a constitutive transcription vector.

The pTREX vector was constructed as follows. An artificial DNA fragment 
containing a putative RNA stabilising sequence, a translation initiation region (TIR), 
a multiple cloning site for insertion of the target genes and a transcription terminator 
was created by annealing two complementary oligonucleotides and extending with Tfl
DNA polymerase. The sense and anti-sense oligonucleotides contained the
recognition sites for NheI and BamHI at their 5' ends respectively to facilitate
cloning. This fragment was cloned between the XbaI and BamHI sites in
pUC19NT7, a derivative of pUC19 which contains the T7 expression cassette from
pLET1 (Wells et al., J. Appl. Bacteriol., 74:629-636 (1993)) cloned between the
EcoRI and HindIII sites. The resulting construct was designated pUCLEX. The
complete expression cassette of pUCLEX was then removed by cutting with HindIII
and blunting followed by cutting with EcoRI before cloning into EcoRI and SacI
(blunted) sites of pIL253 to generate the vector pTREX (Wells and Schofield, In
Current advances in metabolism, genetics and applications-NATO ASI Series, H
98:37-62 (1996)). The putative RNA stabilising sequence and TIR are derived from
the Escherichia coli T7 bacteriophage sequence and modified at one nucleotide
position to enhance the complementarity of the Shine Dalgarno (SD) motif to the
ribosomal 16S RNA of Lactococcus lactis (Schofield et al., pers. comm., University of
Cambridge Dept. Pathology).

A Lactococcus lactis MG1363 chromosomal DNA fragment exhibiting promoter 
activity which was subsequently designated P7 was cloned between the EcoRI and
BglII sites present in the expression cassette, creating pTREX7. This active
promoter region had been previously isolated using the promoter probe vector
pSB292 (Waterfield et al., Gene, 165:9-15 (1995)). The promoter fragment was
amplified by PCR using the Vent DNA polymerase according to the manufacturer's
instructions.

The pTREP1 vector was then constructed as follows. An artificial DNA fragment 
which included a transcription terminator, the forward pUC sequencing primer, a 
promoter multiple-cloning site region and a universal translation stop sequence was 
created by annealing two overlapping partially complementary synthetic 
oligonucleotides together and extending with Sequenase according to the manufacturer's 
instructions. The sense and anti-sense (pTREPp and pTREPR) oligonucleotides 
contained the recognition sites for EcoRV and BamHI at their 5' ends respectively to 
facilitate cloning into pTREX7. The transcription terminator was that of the Bacillus 
penicillinase gene, which has been shown to be effective in Lactococcus (Jos et al., 
Applied and Environmental Microbiology, 50:540-542 (1985)). This was considered 
necessary as expression of target genes in the pTREX vectors was observed to be 
leaky and is thought to be the result of cryptic promoter activity in the origin region 
(Schofield et al., pers. comm., University of Cambridge, Dept. Pathology). The 
forward pUC sequencing primer was included to enable direct sequencing of cloned 
DNA fragments. The translation stop sequence, which encodes a stop codon in 3 
different frames, was included to prevent translational fusions between vector genes 
and cloned DNA fragments. The pTREX7 vector was first digested with EcoRI and 
blunted using the 5' - 3' polymerase activity of T4 DNA polymerase (NEB) 
according to the manufacturer's instructions. The EcoRI digested and blunt ended 
pTREX7 vector was then digested with BglII, thus removing the P7 promoter. The 
artificial DNA fragment derived from the annealed synthetic oligonucleotides was 
then digested with EcoRV and BamHI and cloned into the EcoRI(blunted)-BglII 
digested pTREX7 vector to generate pTREP. A Lactococcus lactis MG1363 
chromosomal promoter designated P1 was then cloned between the EcoRI and BglII 
sites present in the pTREP expression cassette forming pTREP1. This promoter was 
also isolated using the promoter probe vector pSB292 and characterised by 
Waterfield et al., (1995), supra. The P1 promoter fragment was originally 
amplified by PCR using Vent DNA polymerase according to the manufacturer's 
instructions and cloned into pTREX as an EcoRI-BglII DNA fragment. The 
EcoRI-BglII P1 promoter containing fragment was removed from pTREX1 by 
restriction enzyme digestion and used for cloning into pTREP (Schofield et al., pers. 
comm., University of Cambridge, Dept. Pathology). 

(b) PCR amplification of the S. aureus nuc gene 

The nucleotide sequence of the S. aureus nuc gene (EMBL database accession 
number V01281) was used to design synthetic oligonucleotide primers for PCR 
amplification. The primers were designed to amplify the mature form of the nuc 
gene designated nucA which is generated by proteolytic cleavage of the N-terminal 
19 to 21 amino acids of the secreted propeptide designated SNase B (Shortle, (1983), 
supra). Three sense primers (nucS1, nucS2 and nucS3, Appendix 1) were designed, 
each one having a blunt-ended restriction endonuclease cleavage site for EcoRV or 
SmaI in a different reading frame with respect to the nuc gene. Additionally BglII 
and BamHI sites were incorporated at the 5' ends of the sense and anti-sense primers 
respectively to facilitate cloning into BamHI and BglII cut pTREP1. The sequences 
of all the primers are given in Appendix 1. Three nuc gene DNA fragments 
encoding the mature form of the nuclease gene (NucA) were amplified by PCR using 
each of the sense primers combined with the anti-sense primer described above. The 
nuc gene fragments were amplified by PCR using S. aureus genomic DNA template, 
Vent DNA Polymerase (NEB) and the conditions recommended by the 
manufacturer. An initial denaturation step at 93 °C for 2 min was followed by 30 
cycles of denaturation at 93 °C for 45 seconds, annealing at 50 °C for 45 seconds, and 
extension at 73 °C for 1 minute and then a final 5 min extension step at 73 °C. The 
PCR amplified products were purified using a Wizard clean up column (Promega) to 
remove unincorporated nucleotides and primers. 
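
The placement of the blunt cloning site in three reading frames can be checked directly from the sense primer sequences. The following Python sketch is illustrative only. It takes the three sense primers as listed in Appendix 1, locates the blunt cut position of EcoRV (GAT^ATC) or SmaI (CCC^GGG), and counts the nucleotides between that cut and the 3' segment shared by all three primers. The helper name `frame_offset` and the use of the shared 3' segment as the start of the nuc-specific sequence are assumptions made for this example.

```python
# Sense primers as listed in Appendix 1 (5' -> 3').
PRIMERS = {
    "nucS1": "cgagatctgatatctcacaaacagataacggcgtaaatag",
    "nucS2": "gaagatcttccccgggatcacaaacagataacggcgtaaatag",
    "nucS3": "cgagatctgatatccatcacaaacagataacggcgtaaatag",
}

# Blunt cutters: recognition site and cut position within the site.
BLUNT_SITES = {"EcoRV": ("gatatc", 3), "SmaI": ("cccggg", 3)}

# 3' segment common to all three primers (assumed to mark the nuc-specific part).
NUC_CORE = "tcacaaacagataacggcgtaaatag"


def frame_offset(primer: str) -> int:
    """Return (bases between the blunt cut and the nuc core) modulo 3."""
    core_start = primer.index(NUC_CORE)
    for site, cut in BLUNT_SITES.values():
        pos = primer.find(site)
        if pos != -1 and pos + cut <= core_start:
            return (core_start - (pos + cut)) % 3
    raise ValueError("no blunt site found upstream of the nuc core")


frames = {name: frame_offset(seq) for name, seq in PRIMERS.items()}
print(frames)
```

The three primers give three different offsets modulo 3, so a blunt-ended insert ligated at the cut site can be read in frame with nuc in exactly one of the three vectors.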

(c) Construction of the pTREP1-nuc vectors 

The purified nuc gene fragments described in section (b) were digested with BglII and 
BamHI using standard conditions and ligated to BamHI and BglII cut and 
dephosphorylated pTREP1 to generate the pTREP1-nuc1, pTREP1-nuc2 and 
pTREP1-nuc3 series of reporter vectors. General molecular biology techniques were 
carried out using the reagents and buffers supplied by the manufacturer or using 
standard conditions (Sambrook and Maniatis, (1989), supra). In each of the 
pTREP1-nuc vectors the expression cassette comprises a transcription terminator, lactococcal 
promoter P1, unique cloning sites (BglII, EcoRV or SmaI) followed by the mature 
form of the nuc gene and a second transcription terminator. Note that the sequences 
required for translation and secretion of the nuc gene were deliberately excluded in 
this construction. Such elements can only be provided by appropriately digested 
foreign DNA fragments (representing the target bacterium) which can be cloned into 
the unique restriction sites present immediately upstream of the nuc gene. 

In possessing a promoter, the pTREP1-nuc vectors differ from the pFUN vector 
described by Poquet et al. (1998), supra, which was used to identify L. lactis 
exported proteins by screening for Nuc activity directly in L. lactis. As the 
pFUN vector does not contain a promoter upstream of the nuc open reading frame 
the cloned genomic DNA fragment must also provide the signals for transcription in 
addition to those elements required for translation initiation and secretion of Nuc. 
This limitation may prevent the isolation of genes that are distant from a promoter, 
for example genes which are within polycistronic operons. Additionally there can be 
no guarantee that promoters derived from other species of bacteria will be 
recognised and functional in L. lactis. Certain promoters may be under stringent 
regulation in the natural host but not in L. lactis. In contrast, the presence of the P1 
promoter in the pTREP1-nuc series of vectors ensures that promoterless DNA 
fragments (or DNA fragments containing promoter sequences not active in L. lactis) 
will still be transcribed. 

(d) Screening for secreted proteins in S. pneumoniae 

Genomic DNA isolated from S. pneumoniae was digested with the restriction 
enzyme Tru9I. This enzyme, which recognises the sequence 5'-TTAA-3', was used 
because it cuts A/T rich genomes efficiently and can generate random genomic 
DNA fragments within the preferred size range (usually averaging 0.5 - 1.0 kb). 
This size range was preferred because there is an increased probability that the P1 
promoter can be utilised to transcribe a novel gene sequence. However, the P1 
promoter may not be necessary in all cases as it is possible that many Streptococcal 
promoters are recognised in L. lactis. DNA fragments of different size ranges were 
purified from partial Tru9I digests of S. pneumoniae genomic DNA. As the Tru9I 
restriction enzyme generates staggered ends the DNA fragments had to be made 
blunt ended before ligation to the EcoRV or SmaI cut pTREP1-nuc vectors. This 
was achieved by the partial fill-in enzyme reaction using the 5'-3' polymerase 
activity of Klenow enzyme. Briefly, Tru9I digested DNA was dissolved in a solution 
(usually between 10-20 µl in total) supplemented with T4 DNA ligase buffer (New 
England Biolabs; NEB) (1X) and 33 µM of each of the required dNTPs, in this case 
dATP and dTTP. Klenow enzyme was added (1 unit Klenow enzyme (NEB) per µg 
of DNA) and the reaction incubated at 25 °C for 15 minutes. The reaction was 
stopped by incubating the mix at 75 °C for 20 minutes. EcoRV or SmaI digested 
pTREP-nuc plasmid DNA was then added (usually between 200-400 ng). The mix 
was then supplemented with 400 units of T4 DNA ligase (NEB) and T4 DNA ligase 
buffer (1X) and incubated overnight at 16 °C. The ligation mix was precipitated 
directly in 100% ethanol and 1/10 volume of 3M sodium acetate (pH 5.2) and used 
to transform L. lactis MG1363 (Gasson, 1983). Alternatively, the gene cloning site 
of the pTREP-nuc vectors also contains a BglII site which can be used to clone, for 
example, Sau3AI digested genomic DNA fragments. 
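
The digestion and fill-in steps above can be sketched in a few lines of Python. This is a simplified illustration, not the procedure used in the work described: `digest` performs a complete (not partial) in silico digest at TTAA sites, and `fill_in` models which bases a polymerase could add opposite a 5' overhang given the dNTPs supplied. Both function names and the 40 bp test sequence are hypothetical.

```python
import re

TRU9I_SITE = "TTAA"   # Tru9I recognition sequence, cut after the first T (T^TAA)
CUT_OFFSET = 1
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def digest(seq: str, site: str = TRU9I_SITE, cut: int = CUT_OFFSET) -> list[str]:
    """Complete in silico digest of the top strand of a linear sequence."""
    seq = seq.upper()
    # A lookahead pattern also reports overlapping sites.
    cuts = [m.start() + cut for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def fill_in(overhang: str, dntps: set[str]) -> str:
    """Bases added to the recessed 3' end opposite a 5' overhang.

    The overhang is copied starting next to the duplex; extension stops at
    the first position whose complementary dNTP is not supplied.
    """
    added = []
    for base in reversed(overhang.upper()):
        needed = COMPLEMENT[base]
        if needed not in dntps:
            break
        added.append(needed)
    return "".join(added)


# Hypothetical 40 bp test sequence, not taken from the S. pneumoniae genome.
example = "GGCATTAACCGTAGGCTTAAGGATCCATTAACGGTACGTA"
fragments = digest(example)
print([len(f) for f in fragments])

# Tru9I leaves a 5'-TA overhang, which dATP and dTTP are sufficient to fill.
print(fill_in("TA", {"A", "T"}))
# A Sau3AI end (5'-GATC) cannot be extended with dATP and dTTP alone.
print(fill_in("GATC", {"A", "T"}))
```

In this model, supplying only dATP and dTTP fills Tru9I ends completely, which is consistent with the choice of nucleotides stated in the protocol.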

L. lactis transformant colonies were grown on brain heart infusion agar and nuclease 
secreting (Nuc+) clones were detected by a toluidine blue-DNA-agar overlay (0.05 
M Tris pH 9.0, 10 g of agar per litre, 10 g of NaCl per litre, 0.1 mM CaCl2, 0.03 % 
wt/vol. salmon sperm DNA and 90 mg of Toluidine blue O dye) essentially as 
described by Shortle, 1983, supra and Le Loir et al., 1994, supra. The plates were 
then incubated at 37 °C for up to 2 hours. Nuclease secreting clones develop an 
easily identifiable pink halo. Plasmid DNA was isolated from Nuc+ recombinant L. 
lactis clones and DNA inserts were sequenced on one strand using the NucSeq 
sequencing primer described in Appendix 1, which sequences directly through the 
DNA insert. 

Isolation of Genes Encoding Exported Proteins from 
S. pneumoniae 

A large number of gene sequences putatively encoding exported proteins in S. 
pneumoniae have been identified using the nuclease screening system. These have 
now been further analysed to remove artefacts. The sequences identified using the 
screening system have been analysed using a number of parameters. 

1. All putative surface proteins were analysed for leader/signal peptide 
sequences using the software programs Sequencher (Gene Codes Corporation) and 
DNA Strider (Marck, Nucleic Acids Res., 16:1829-1836 (1988)). Bacterial signal 
peptide sequences share a common design. They are characterised by a short 
positively charged N-terminus (n-region) immediately preceding a stretch of 
hydrophobic residues (central portion, h-region) followed by a more polar C-terminal 
portion which contains the cleavage site (c-region). Computer software is available 
which allows hydropathy profiling of putative proteins and which can readily 
identify the very distinctive hydrophobic portion (h-region) typical of leader peptide 
sequences. In addition, the sequences were checked for the presence or absence of 
a potential ribosomal binding site (Shine-Dalgarno motif) required for translation 
initiation of the putative nuc reporter fusion protein. 

2. All putative surface protein sequences were also matched against all of the 
protein/DNA sequences in the publicly available databases [OWL-proteins inclusive of 
SwissProt and GenBank translations]. This allows us to identify sequences similar to 
known genes or homologues of genes for which some function has been ascribed. 
Hence it has been possible to predict a function for some of the genes identified 
using the LEEP system and to unequivocally establish that the system can be used to 
identify and isolate gene sequences of surface associated proteins. We should also be 
able to confirm that these proteins are indeed surface related and not artifacts. The 
LEEP system has been used to identify novel gene targets for vaccine and therapy. 

3. Some of the proteins encoded by the identified genes did not possess a typical leader 
peptide sequence and did not show homology with any DNA/protein sequences in 
the database. Indeed these proteins may indicate the primary advantage of our 
screening method, i.e. the isolation of atypical surface-related proteins, which may 
have been missed in all previously described screening protocols or approaches 
based on sequence homology searches. 
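
The hydropathy profiling mentioned in the first criterion can be illustrated with a short sliding-window calculation. The sketch below is not the software named above: it assumes the widely used Kyte-Doolittle hydropathy scale, a 9-residue window and a search limited to the first 40 residues, and it runs on a made-up leader sequence. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 9) -> list[float]:
    """Mean hydropathy of every full window along the sequence."""
    return [
        sum(KD[aa] for aa in protein[i:i + window]) / window
        for i in range(len(protein) - window + 1)
    ]


def h_region(protein: str, window: int = 9, search: int = 40) -> tuple[int, float]:
    """Start index and score of the most hydrophobic window near the N-terminus."""
    profile = hydropathy_profile(protein[:search], window)
    start = max(range(len(profile)), key=profile.__getitem__)
    return start, profile[start]


# Made-up leader: charged n-region, hydrophobic h-region, polar c-region.
example = "MKKRK" + "LLALLAVVALLAG" + "CSSQDTEKNDKE"
start, score = h_region(example)
print(start, round(score, 2))
```

A strongly positive peak within the first few dozen residues, preceded by positively charged residues, is the pattern described for leader peptides; a sequence lacking such a peak would fall into the atypical group described in the third criterion.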

In all cases, only partial gene sequences were initially obtained. Full length genes 
were obtained in all cases by reference to the TIGR S. pneumoniae database 
(www@tigr.org). Thus, by matching the originally obtained partial sequences with 
the database, we were able to identify the full length gene sequences. In this way, as 
described herein, three groups of genes were clearly identified, i.e. a group of genes 
encoding previously unidentified S. pneumoniae proteins, a second group exhibiting 
some homology with known proteins from a variety of sources and a third group 
which encoded known S. pneumoniae proteins, which were, however, not known as 
antigens. 

Example 2: Vaccine trials 
pcDNA3.1+ as a DNA vaccine vector 

The vector chosen for use as a DNA vaccine vector was pcDNA3.1 (Invitrogen) 
(actually pcDNA3.1+; the forward orientation was used in all cases but may be 
referred to as pcDNA3.1 from here on). This vector has been widely and successfully 
employed in the literature as a host vector to test vaccine candidate genes for protection against 
pathogens (Zhang et al., Kurar and Splitter, Anderson et al.). The 
vector was designed for high-level stable and non-replicative transient expression in 
mammalian cells. pcDNA3.1 contains the ColE1 origin of replication which allows 
convenient high-copy number replication and growth in E. coli. This in turn allows 
rapid and efficient cloning and testing of many genes. The pcDNA3.1 vector has a 
large number of cloning sites and also contains the gene encoding ampicillin 
resistance to aid in cloning selection and the human cytomegalovirus (CMV) 
immediate-early promoter/enhancer which permits efficient, high-level expression of 
the recombinant protein. The CMV promoter is a strong viral promoter in a wide 
range of cell types including both muscle and immune (antigen presenting) cells. 
This is important for optimal immune response as it remains unknown as to which 
cell types are most important in generating a protective response in vivo. A T7 
promoter upstream of the multiple cloning site affords efficient expression of the 
modified insert of interest and allows in vitro transcription of a cloned gene in 
the sense orientation. 

Zhang, D., Yang, X., Berry, J., Shen, C., McClarty, G. and Brunham, R.C. (1997) 
"DNA vaccination with the major outer-membrane protein genes induces acquired 
immunity to Chlamydia trachomatis (mouse pneumonitis) infection". Infection and 
Immunity, 176, 1035-40. 

Kurar, E. and Splitter, G.A. (1997) "Nucleic acid vaccination of Brucella abortus 
ribosomal L7/L12 gene elicits immune response". Vaccine, 15, 1851-57. 

Anderson, R., Gao, X.-M., Papakonstantinopoulou, A., Roberts, M. and Dougan, 
G. (1996) "Immune response in mice following immunisation with DNA encoding 
fragment C of tetanus toxin". Infection and Immunity, 64, 3168-3173. 

Preparation of DNA vaccines 

Oligonucleotide primers were designed for each individual gene of interest derived 
using the LEEP system. Each gene was examined thoroughly, and where possible, 
primers were designed such that they targeted that portion of the gene thought to 
encode only the mature portion of the protein. It was hoped that expressing 
those sequences that encode only the mature portion of a target protein would 
facilitate its correct folding when expressed in mammalian cells. For example, in the 
majority of cases primers were designed such that putative N-terminal signal peptide 
sequences would not be included in the final amplification product to be cloned into 
the pcDNA3.1 expression vector. The signal peptide directs the polypeptide 
precursor to the cell membrane via the protein export pathway where it is normally 
cleaved off by signal peptidase I (or signal peptidase II if a lipoprotein). Hence the 
signal peptide does not make up any part of the mature protein whether it be 
displayed on the surface of the bacteria or secreted. Where an N-terminal 
leader peptide sequence was not immediately obvious, primers were designed to 
target the whole of the gene sequence for cloning and ultimately, expression in 
pcDNA3.1. 

Having said that, however, other additional features of proteins may also affect the 
expression and presentation of a soluble protein. DNA sequences encoding such 
features in the genes encoding the proteins of interest were excluded during the 
design of oligonucleotides. These features included: 

1. LPXTG cell wall anchoring motifs. 

2. LXXC lipoprotein attachment sites. 

3. Hydrophobic C-terminal domain. 

4. Where no N-terminal signal peptide or LXXC was present the start codon was 
excluded. 

5. Where no hydrophobic C-terminal domain or LPXTG motif was present the stop 
codon was removed. 

Appropriate PCR primers were designed for each gene of interest and any and all of 
the regions encoding the above features were removed from the gene when designing 
these primers. The primers were designed with the appropriate enzyme restriction 
site followed by a conserved Kozak nucleotide sequence (in most cases GCCACC was 
used, except in occasional instances, for example ID59; the Kozak sequence 
facilitates the recognition of initiator sequences by eukaryotic ribosomes) and an 
ATG start codon upstream of the insert of the gene of interest. For example, for a 
forward primer using a BamHI site the primer would begin 
GCGGGATCCGCCACCATG followed by a small section of the 5' end of the gene 
of interest. The reverse primer was designed to be compatible with the forward 
primer and with a NotI restriction site at the 5' end in most cases (this site is 
TTGCGGCCGC) (NB except in occasional instances, for example ID59, where an 
XhoI site was used instead of NotI). 
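
The primer layout described above can be written out as a small helper. The Python sketch below is an illustration under stated assumptions: the function names, the `clamp` parameter (the leading GCG of the example given in the text) and the default arguments are hypothetical, and the gene segments used in the usage lines are back-derived from the ID29 primers listed in this document purely to exercise the code.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def forward_primer(gene_5prime: str, site: str = "GGATCC",
                   clamp: str = "GCG", kozak: str = "GCCACC") -> str:
    """Leading bases + restriction site + Kozak sequence + ATG + start of insert."""
    return clamp + site + kozak + "ATG" + gene_5prime.upper()


def reverse_primer(gene_3prime: str, tail: str = "TTGCGGCCGC") -> str:
    """NotI-containing tail followed by the reverse complement of the insert end."""
    return tail + reverse_complement(gene_3prime)


# Illustrative inputs only (segments back-derived from the listed ID29 primers).
fwd = forward_primer("CAAAAAGAGCGGTATGGTTATG")
rev = reverse_primer("AAGGGATTAAGAATGGGGGT")
print(fwd)
print(rev)
```

Passing a different `site` or `tail` covers the exceptions noted in the text, such as the use of an XhoI site in the reverse primer.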

PCR primers 

The following PCR primers were designed and used to amplify the truncated genes 
of interest. 

ID5 

Forward Primer 5' CGGATCCGGCACGATGGGTCTAATTGAAGAGTTAAAAAATGAA 3' 
Reverse Primer 5' TTGGGGGGGGGAATGGTAGAGTAAACACAAGAGTCA 3' 

ID59 

Forward Primer 5' GGCGGATCCATGAAAAAAATCTATTGATTTTTAGGA 3' 
Reverse Primer 5' GCCTGGAGGGCTAGTTGGGATAGATTTTAAAGTGTAGG 3' 

ID51 

Forward Primer 5' GGGATGCGGCAGGATGAGTGATGTCGGTGGAAATG 3' 
Reverse Primer 5' TTGGGGGCGGATACGAAACGGTGAGATGTAGG 3' 

ID29 

Forward Primer 5' CGGATCCGCCACCATGCAAAAAGAGCGGTATGGTTATG 3' 
Reverse Primer 5' TTGCGGCCGCACCCCCATTCTTAATCCCTT 3' 

ID50 

Forward Primer 5' CGGATCCGCCACCATGGAGGTATGTGAAATGTCACGTAAA 3' 
Reverse Primer 5' TTGCGGCCGCTTTTACAAAGTCAAGCAAAGCC 3' 

Cloning 

The insert along with the flanking features described above was amplified using PCR 
against a template of genomic DNA isolated from type 4 S. pneumoniae strain 11886 
obtained from the National Collection of Type Cultures. The PCR product was cut 
with the appropriate restriction enzymes and cloned into the multiple cloning site of 
pcDNA3.1 using conventional molecular biological techniques. Suitably mapped 
clones of the genes of interest were cultured and the plasmids isolated on a large 
scale (> 1.5 mg) using Plasmid Mega Kits (Qiagen). Successful cloning and 
maintenance of genes was confirmed by restriction mapping and sequencing ~700 
base pairs through the 5' cloning junction of each large scale preparation of each 
construct. 

Strain validation 

A strain of type 4 was used in cloning and challenge methods, which is the strain 
from which the S. pneumoniae genome was sequenced. A freeze dried ampoule of a 
homogeneous laboratory strain of type 4 S. pneumoniae strain NCTC 11886 was 
obtained from the National Collection of Type Cultures. The ampoule was opened and 
the culture resuspended with 0.5 ml of tryptic soy broth (0.5% glucose, 5% 
blood). The suspension was subcultured into 10 ml tryptic soy broth (0.5% glucose, 
5% blood) and incubated statically overnight at 37 °C. This culture was streaked on 
to 5% blood agar plates to check for contaminants and confirm viability and on to 
blood agar slopes and the rest of the culture was used to make 20% glycerol stocks. 
The slopes were sent to the Public Health Laboratory Service where the type 4 
serotype was confirmed. 

A glycerol stock of NCTC 11886 was streaked on a 5% blood agar plate and 
incubated overnight in a CO2 gas jar at 37 °C. Fresh streaks were made and optochin 
sensitivity was confirmed. 

Pneumococcal challenge 

A standard inoculum of type 4 S. pneumoniae was prepared and frozen down by 
passaging a culture of pneumococcus 1x through mice, harvesting from the blood of 
infected animals, and growing up to a predetermined viable count of around 10^ 
cfu/ml in broth before freezing down. The preparation is set out below as per the 
flow chart. 

Streak pneumococcal culture and confirm identity 
    | 
    V 
Grow over-night culture from 4-5 colonies on plate above 
    | 
    V 
Animal passage pneumococcal culture 
(i.p. injection of cardiac bleed to harvest) 
    | 
    V 
Grow over-night culture from animal passaged pneumococcus 
    | 
    V 
Grow day culture (to pre-determined optical density) from over-night of animal 
passage and freeze down at -70°C - This is standard inoculum 
    | 
    V 
Thaw one aliquot of standard inoculum to viable count 
    | 
    V 
Use standard inoculum to determine effective dose (called Virulence Testing) 
    | 
    V 
All subsequent challenges - use standard inoculum to effective dose 

An aliquot of standard inoculum was diluted 500x in PBS and used to inoculate the 
mice. 

Mice were lightly anaesthetised using halothane and then a dose of 1.4 x 10^ cfu of 
pneumococcus was applied to the nose of each mouse. The uptake was facilitated by 
the normal breathing of the mouse, which was left to recover on its back. 

S. pneumoniae Vaccine trials 

Vaccine trials in mice were carried out by the administration of DNA to 6 week old 
CBA/ca mice (Harlan, UK). Mice to be vaccinated were divided into groups of six 
and each group was immunised with recombinant pcDNA3.1+ plasmid DNA 
containing a specific target-gene sequence of interest. A total of 100 µg of DNA in 
Dulbecco's PBS (Sigma) was injected intramuscularly into the tibialis anterior 
muscle of both legs (50 µl in each leg). A boost was carried out using the same 
procedure 4 weeks later. For comparison, control groups were included in all 
vaccine trials. These control groups were either unvaccinated animals or those 
administered with non-recombinant pcDNA3.1+ DNA (sham vaccinated) only, 
using the same time course described above. 3 weeks after the second immunisation, 
all mice groups were challenged intra-nasally with a lethal dose of S. pneumoniae 
serotype 4 (strain NCTC 11886). The number of bacteria administered was 
monitored by plating serial dilutions of the inoculum on 5% blood agar plates. A 
problem with intranasal immunisations is that in some mice the inoculum bubbles out 
of the nostrils; this has been noted in the results table and taken account of in 
calculations. A less obvious problem is that a certain amount of the inoculum for 
each mouse may be swallowed. It is assumed that this amount will be the same for 
each mouse and will average out over the course of inoculations. However, the 
sample sizes that have been used are small and this problem may have significant 
effects in some experiments. All mice remaining after the challenge were killed 3 or 
4 days after infection. During the infection process, challenged mice were monitored 
for the development of symptoms associated with the onset of S. pneumoniae 
induced disease. Typical symptoms in an appropriate order included piloerection, 
an increasingly hunched posture, discharge from eyes, increased lethargy and 
reluctance to move. The latter symptoms usually coincided with the development of 
a moribund state at which stage the mice were culled to prevent further suffering. 
These mice were deemed to be very close to death, and the time of culling was used 
to determine a survival time for statistical analysis. Where mice were found dead, 
the survival time was taken as the last time point when the mouse was monitored 
alive. 

Interpretation of Results 

A positive result was taken as any DNA sequence that was cloned and used in 
challenge experiments as described above which gave protection against that 
challenge. Protection was taken as those DNA sequences that gave statistically 
significant protection (to a 95% confidence level (p<0.05)) and also those which 
were marginal or close to significant using Mann-Whitney or which show some 
protective features, for example where there were one or more outlying mice or 
the time to the first death was prolonged. It is acceptable to allow marginal or 
non-significant results to be considered as potential positives when it is considered that 
the clarity of some of the results may be clouded by the problems associated with the 
administration of intranasal infections. 

p value 2 refers to significance tests compared to pcDNA3.1+ vaccinated controls 

Statistical Analyses. 

Trial 1 - None of the other groups had significantly longer survival times than the 
controls. The survival times of the unvaccinated and pcDNA3.1 control groups were 
not significantly different. One of the mice from ID5 was an outlying result and the 
mean survival times for ID5 were extended but not significantly so. 
Trial 2 - The group vaccinated with ID59 had significantly longer survival times than 
the unvaccinated control group. 

Trial 5 - The group vaccinated with ID59 again survived for an average of almost 10 
hours longer than the controls but the results were not quite statistically significant. 
Trial 6 - The group vaccinated with ID51 did not have survival times significantly 
higher than unvaccinated controls (p= <36.0); however, there were 2 outlying mice 
in the vaccinated group. 

Vaccine trials 7 and 8 (See figure 2) 

Mean survival times (hours) 

Mouse number   Unvacc control (7)   ID29 (7)   Unvacc control (8)   ID50 (8) 
1              59.6                 73.1       45.1                 60.6 
2              47.2                 54.8       50.8                 60.6 
3              59.6                 59.3       60.4                 51.1 
4              70.9                 54.8*      55.2                 60.6 
5              68.6*                59.3       45.1                 60.6 
6              76.0                 54.8       45.1                 60.6 
Mean           63.6                 59.35      50.2                 59.1 
sd             10.3                 7.1        6.4                  3.9 
p value 1                           <39.0                           0.0048 

* - bubbled when dosed so may not have received full inoculum. 

T - terminated at end of experiment having no symptoms of infection. 
Numbers in brackets - survival times disregarded assuming incomplete dosing 
p value 1 refers to significance tests compared to unvaccinated controls 
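
The Mann-Whitney comparison referred to under Interpretation of Results can be sketched with the Trial 8 survival times from the table for trials 7 and 8. The Python below is a minimal, standard-library illustration and not the analysis software used for the reported figures: it computes the U statistics directly and a two-sided p-value from the normal approximation with a tie correction (an assumed choice), so the p-value it prints need not equal the tabulated value, which may have been obtained with an exact method.

```python
import math


def mann_whitney_u(x: list[float], y: list[float]) -> tuple[float, float]:
    """U statistics for two samples; ties between samples count 0.5."""
    u_y = sum(1.0 if b > a else 0.5 if b == a else 0.0 for a in x for b in y)
    return len(x) * len(y) - u_y, u_y


def normal_approx_p(x: list[float], y: list[float]) -> float:
    """Two-sided p-value, normal approximation with tie correction."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    u = min(mann_whitney_u(x, y))
    counts: dict[float, int] = {}
    for value in x + y:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    return math.erfc(abs(z) / math.sqrt(2))


# Trial 8 survival times (hours), copied from the table.
unvaccinated = [45.1, 50.8, 60.4, 55.2, 45.1, 45.1]
id50 = [60.6, 60.6, 51.1, 60.6, 60.6, 60.6]

u_control, u_vaccinated = mann_whitney_u(unvaccinated, id50)
p_value = normal_approx_p(unvaccinated, id50)
print(u_control, u_vaccinated, round(p_value, 4))
```

With six mice per group the normal approximation is coarse, which is one reason an exact test or a dedicated statistics package would normally be preferred for data of this size.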

Statistical Analyses. 

Trial 7 - The ID29 vaccinated group showed prolonged times to the first death. 
Trial 8 - The group vaccinated with ID50 survived significantly longer than 
unvaccinated controls. 

Appendix 1 - Oligonucleotide primers 

nucS1 
BglII EcoRV 
5'- cgagatctgatatctcacaaacagataacggcgtaaatag -3' 

nucS2 
BglII SmaI 
5'- gaagatcttccccgggatcacaaacagataacggcgtaaatag -3' 

nucS3 
BglII EcoRV 
5'- cgagatctgatatccatcacaaacagataacggcgtaaatag -3' 

nucR 
BamHI 
5'- cgggatccttatggacctgaatcagcgttgtc -3' 

NucSeq 
5'- ggatgctttgtttcaggtgtatc -3' 

pTREPp 
5'- catgatatcggtacctcaagctcatatcattgtccggcaatggtgtgggctttttttgttttagcggataa 
caatttcacac -3' 

pTREPR 
5'- gcggatcccccgggcttaattaatgtttaaacactagtcgaagatctcgcgaattctcctgtgtgaaatt 
gttatccgcta -3' 

pUCp 
5'- cgccagggttttcccagtcacgac -3' 

vr 
5'- tcaggggggcggagcctatg -3' 

V1 
5'- tcgtatgttgtgtggaattgtg -3' 

V2 
5'- tccggctcgtatgttgtgtggaattg -3' 

TABLE 1 



ID4 1200 bp 

ATGAGAAATATGTGGGTTGTAATCAAGGAAACCrATCTTCGACATGTCGAGTCATGGAGTi-lCriCl-riATGGTGA 
TTTCGCCGTTCCTCTTTTTAGGAATCTCTGTAGGAATTGGGCATCTCCAAGGTTC^ 

AGTGGCAGTAGTGACAACAGTGCCATCTGTAGCAGAAGGACTGAAGAATGTAAATGGTGTTAACTTCGACTATAA 
AGACGAAGCAAGTGCCAAAGAAGCAATTAAAGAAGAAAAATTAAAAGGTTATTTGACCATTGATCAAGAAGATA 

GTGTTCrAAAGGCAGTTTATCATGGCGAAACATCGCTTGAAAATGGAATTAAATTTGAGGTTACAGGTACACTCA 
ATGAACTGCAAAATCAGCTTAATCGTTCAACTGCTTCCTTGTCTCAAGAGCAGGAAAAACGCTTAGCGCAGACAA 
TTCAATTC ACAG AAAAGATTGATGAAGCCAAGGAAAATAAAAAGTTTATTCAAACAATTGCAGCAGGTGCCTTAG 
GATTCTIUCrin'AT ATGATTC TGATTACCTATGCGGGTGTAACAGCTCAGGAAG'TTGCCAGTGAAAAAGGCACCAA 
A ATTA TGGAAGTCGTTTTrrCTAGCATAAGGGCAAGTCACTATTTCTATGCGCGGATGATGGCT 

ArmA ACGCATWTTGGG ATCTATGTTGTAGGTGGTCTGGCTGCCGTTTTGCT 

TCAGTCTGGTATTTTGGATCACTTGGGAGATGCTATCTCACrGAATACCTTGCTCm 
TGTAC GTAG TCTTGGCAGCCT TCCT AGGATCTATGGTTTCTCGTCCTGAGGACTCAGGGAAAGCCT^ 
GATGATTTTGATTATGGGTGGTTTTTTTGGAGTGACAGCT 
GGTTCTTATATTCCCTTTATTTCGACCnTCTTTATGCCGTTTC^ 

CATGGATTTCACTTGCTATTACAGTGATTTTTGCGGTGGTAGCAACAGGATTTATCGGACGC^^ 
CGTTCTTCAAACGGATGATTTAGGGATTTGGAAAACCTTTAAACGTGCCTTATOT 

MRNMWVVKETYLRHVESWSFFFMVISPFLFLGISVGIGHLQGSSMAKNNKVAVVTTVPSVAEGLKNVNGVNFDYKD 

EASAKEAIKEEKLKGYLTIDQEDSVLKAVYHGETSLENGIKFEVTGTLNELQNQLNRSTASLSQEQEKRLAQTIQFTEKI 

DEAKENKKFIQTIAAGALGFFLYMILITYAGVTAQEVASEKGTKIMEVVFSSIRASHYFYARMMALFLVILTHIGIYVVG 

GlAAVLLFKDLPFLAQSGILDHUjDAISLhTTLLHLISLFMYVVLAAFLGSMVSRPEDSGKALSPLMILIMGGFFGVTALG 

AAGDNLLLKIGSYIPnSTFFMPFRTINDYAGGAEAWISLAITVIFAVVATGFlGRMYASLVLQTDDLGIWKTFKRALSYK 
Z 

ID5 1125 bp 

cctgggaaagtcttgaaaattatgatagaatggtggaaggaaaaattcaggagagtagtagtgactcaaaatgtt 
gaaagtcttctcgtatccattgtaatcagtgcatacaatgaagaaaaatatctgcctggtctaattgaagacttaa 
aaaatcaaacctatcctaaagaggatattgaaattctatttataaatgctatgtccacagatgggaccacagcta 
tcattcagcaatttataaaggaagatacagagtttaactcaattagattgtataacaatcctaagaaaaatcaag 

CTAGT GGTTT TAACCTGGGAGTTAAACATTCTGTAGGGGACCTTATTITAAAAATTGATGCTC^ 

tgagacttttgtaatgaacaatgtggctattattcaacaaggtgaatttgtctgtggggggcctagaccgacgatt 
gtcgaaggaaaaggaaaatgggcagagaccttgcatcttgttgaggaaaatatgtttggcagtagcattgccaat 

TATCGAAATAGTTCTGAGGATAGATATGTTTCTTCTATTTTTCATGGAATGTATAAACGAGAGGT^ 
TTGGTTTAGTAAATGAGCAACTTGGCCGAACTGAAGATAATGATATTCATTATAGAATTCGAGAATATGGTTATAA 
AATCCGCTATAGCCCAAGTATTCTATCTTATCAGTATATTCGACCAACATTCAAGAAAATGCTGCATCAAAAG^^^ 
TCAAATGGTTTGTGGATTGGCTTGACAAGTCATGTTCAGTTTAAGTGTTTATCATTATTTCACTAT^ 
A TTTG TTTT GAGT CrTGTGTTTAGTCTAGCATTGTTACCGATCACATTCGTATTCATAACm 
ATTTTCTACTTTTGTCATTACTCACTTTGCTGACTTTATTA^ 
ATTTTATTTrCCATTCACTTTGCTTATGGCCTTGGGACGATTGTAGGTTTAAT^ 

AGTACAAGAGAACAATAATTTATTTGGATAAAATAAGCCAAATAAATCAAAATATGCTATAA 



PGKVLKIMIEWKEKFRRVVVTQNVESLLVSIVISAYNEEKYLPGLIEDLKNQTYPKEDIEILnNAMSTDGTTAlIQQFIK 
EDTEFNSIRLYNNPKKNQASGFNLGVKHSVGDLILKIDAHSKVTETFVMNNVAIIQQGEFVCGGPRPTIVEGKGKWAET 
LHLVEENMFGSSIANYRNSSEDRYVSSIFHGMYKREVFQKVGLVNEQLGRTEDNDIHYRIREYGYKIRYSPSILSYQYIRP 
TFKKMLHQKYSNGLWIGLTSHVQFKCLSLFHYVPCLFVLSLVFSLALLPITFVFITLLLGAYFLLLSLLTLLTLLKHKNGF 
LIVMPFILFSIHFAYGLGTIVGLIRGFKWKKEYKRTIIYLDKISQINQNMLZ 

ID11 696 bp 



ATTTTAATAGTGGCACITGTGACAGGTGCGGGGGCTTTTGCATATAGCACr^ 

GTACCACGCGAATTTACGTAGTGAATCGCAATCAAGGAGACAAGCCGGGGTTGACAAATCAGGATTTGCAGGCAG 
GAACTTATCTGGTAAAAGACTACCGTGAGATTATCCTTTCGCAGGATGTTTTGGAGGAAGTTGm 
ACTAGATTTGACGCCAAAAGGTrrGGCTAATAAAATTAAAGTGACAGTACCAGTTGATACCCGTATTGTCTCTATT 
TCAGTTAATGATCGAGTTCCTGAAGAGGCAAGCCGTATCGCTAACTCTTTGAGAGAAGTAGCTGCTCAAAAAATT 
ATCAGTATTACTCGTGTTTCTGA CGTGAC AACACTGGAGGAGGCAAGGCCGGCGATATCCCCGTCTTCGCCAAAT 
ATTAAACGCAATACACTAATTGGTTTTTTGGCAGGGGTGATTGGAACTAGTGTTATAGTTCTTCATm 
GGATACTCGTGTGAAACGTCCGGAAGATATCGAAAATACATTGCAGATGACACnTTTGGGAGTTGTGCCAAACTT 
GGGTAAGTTGAAATAG 

MMKEQm-IEIDVFQLVKSLWKRKLMIUVALVTGAGAFAYSTFIVKPEYTSTTRIYVVNRNQGDKPGLTNQDU}^^^ 
VKDYREIILSQDVLEEWSDLKLDLTPKGLANKIKVTVPVDTRIVSISVNDRVPEEASRIANSLREVAAQKIISITRVSDVT 
TLEEARPAISPSSPNIKRNTLIGFLAGVIGTSVIVLHLELLDTRVKRPEDIEhTTLQMTLLGVVPNLGKLKZ 

ATGGTAAAAGTAGCAGTTATATTAGCTCAGGGCTTTGAAGAAATTGAAGCCTTGACAGTTGTAGATGTCT^ 

GAGCCAATATCACATGTGATATGGTTGGTTTTGAAGAGCAAGTAACGGGTTCGCATGCAATCCAAGTAAGAGCAG 
ATCATGTCmGATGGAGATTTATCAGACTATGATATGATTGTTCTTCCTGGAGGTATGCCTGGTTCT^ 
CGTGATAATCAGACCTTGATTCAAGAATTGCAAAGCTTCGAGCAAGAAGGGAAGAAACTAGCAGCCATTTGTGCG 
GCACCAATTGCCCTCAATCAAGCAGAGATATTGAAAAATAAGCGATACACTTGTTATGACGGCGTTCAAGAGCAA 

ATCCTTGATGGTCACTACGTCAAGGAAACAGTAGTGGTAGATGGTCAGTTGACAACCAGTCGGGGTCCTTCAACA 
GCCCTTGCCTTTGCCTACGAGTTGGTGGAGCAACTAGGAGGGGACGCAGAGAGTTTACGAACAGGAATGCTCTAT 
CGAGATGTCTTTGGTAAAAATCAGTAA 

MVKVAVILAQGFEEIEALTVVDVLRRANITCDMVGFEEQVTGSHAIQVRADHVFDGDLSDYDMIVLPGGMPGSAHLR 
DNQTLIQELQSFEQEGKKLAAICAAPIALNQAEILKNKRYTCYDGVQEQILDGHYVKETVVVDGQLTTSRGPSTALAFA 
YELVEQLGGDAESLRTGMLYRDVFGKNQZ 

ID27 306 bp 

GTGGTAGGGATGGTAGAACCAAACCTAGAAAGCCTTATAAAAGATCTTTACAATCATGCTCGACATGATTTGAGT 
GAAGATTTAGTTGCTGCTCTCCn'AGAGACTACTAAAAAACTGCCTACTACAAATGAGCAATTGCAGGC^^ 
TCTCAGGCCTGGTCAATCGTGAATTGCTCCTAAATCCCAAACATCCAGCACCTGAGTTGCTCAACTrG 
TGTCAAAAGAGAAGAAGCCAAGTACAGAGGAACTGCGACTTCTGCGCTTATGTATGAGGAACTCTTTAAAATGCT 
TTGA 






MVGMVEPNLESLIKDLYNHARHDLSEDLVAALLETTKKLPTTNEQLQAVRLSGLVNRELLLNPKHPAPELLNLARFVK 
REEAKYRGTATSALMYEELFKMLZ 

ID29 945 bp 



TTGTTCTTAAAAAA GGAA AGAGAGGTAATCAGCATGCGTAAATGGACAAAAGGATTTCTCATCTTTGGTGT^ 
ACTACCGTTATCGGCTTTATCCTGCTTTTTGTAGGTATCCAATCTGACGGGAtTAAGAG 
AGAACCTGTCTATGATAGCCGTACGGAAAAGCTAACCTTTGGCAAGGAAGTCGAAAACCTAGAAATTACT 
CCAACACACGCTCACCATCACAGACTCTTTCGATGATCAAATCCACATTTCTTACCATCCATCTCm 

CATGATCTTATCACCAATCAGAACGATAGAACTCTGAGTCTCACTGATAAGAAACTGTCTGAAACTCCGTTTCTCT 
CTTCTGGAATTGGTGGGATTCTTCATATCGCAAGTAGCTACTCTAGTCGTTTTGAAGAAGTTATTCT 
AAAAGGGAGAACTCTAAAAGGGATCAACATCTCAGCCAATCGCGGACAAACCACCATCATAAATGCTAGCCTTGA 
AAATGCGACCCTCAATA CAAAC AGCTATATCCTCCGAATTGAAGGAAGTCGTATCAAAAACAGTAAACTCACAAC 
GCCCAATATCGTTAATATCTTTGATACAGTTCTTACAGATAGTCAGCTAGAGTCAACAGAGAATCACTTCCACGCT 

GAAAATATCCAAGTCCATGGCAAGGTTGAACTGACTGCCAAAGATTATCTCAGAATCATCCTAGACCAGAAAGAA 
AGCCAACGAATTAACTGGGACATCTCAAGCAACTATGGTTCTATCTTCCAATTCACAAGAGAAAAGCCTGAATCA 
AGAGGTACGGAATTAAGCAACCCTTACAAAACTGAAAAAACCGATGTCAAGGATCAACTCATTGCGAGATCTGAT 
GATAATATTGATCTAATATCCACACCAAGCAGACGTTGA 

MFLKKEREVISMRKWTKGFLIFGVVTTVIGFILLFVGIQSDGIKSLLSMSKEPVYDSRTEKLTFGKEVENLEITLHQHTL^ 
TDSFDDQIHISYHPSLSAHHDLITNQNDRTLSLTDKKLSETPFLSSGIGGILHIASSYSSRFEEVILRLPKGRTLKGINISANR 
GQTTIINASLENATLNTNSYILRIEGSRIKNSKLTTPNIVNIFDTVLTDSQLESTENHFHAENIQVHGKVELTAKDYLRIILD 
QKESQRINWDISSNYGSIFQFTREKPESRGTELSNPYKTEKTDVKDQLIARSDDNIDLISTPSRRZ 

ID30 879 bp 

ATGAAACAAGAATGGTTTGAAAGTAATGATTTTGTAAAAACAACAAGCAAGAACAAGCCTGAAGAGCAAGCTC^ 
AGAOGTTGCAGACAAGGCTGAAGAAACGATAGCCGATCTCGATACACCAATTGAAAAAAATACTCAGTTAGAGG 
AGGAAGTCCCTCAAGCTGAAGTCGAATTGGAAAGCCAGCAAGAAGAGAAAATTGAAGCTCCTGAAGACAGTGAA 

GCGAGAACAGAAATAGAAGAAAAGAAGGCATCTAATTCTACTGAAGAAGAGCCAGACCTTTCTAAAGAAACAGA 
AAAAGTCA CTAT AGCTGAAGAGAGCCAAGAAGCTCnTCCTCAGCAAAAAGCAACCACGAAAGAGCCACTTCTTAT 
CA GTAAAT CITTAGAAAGTCCTTATATCCCCGACCAAGCTCCAAAATCTAGGGATAAATGGAAAGAGCAAGTGCT 
TGATnTTGGTCnTGGCTAGTGGAAGC^ 
ACAGCCTTTCTCTTGCTCATTCTGTTTTCTGCATCTTCCTTT^ 

05 GGACATATAGCAAGCATTAACAGTCGCITCCCTGAGCAGCTAGCTCCTTTAACTCTTT^ 
AGTAGCGACAACACTCnTCTTCTTTTCATTCCTCnTGGGTAGTTTCGTTGTGA 

GA CTGGACGCTAGAC AAGGTTCTCCAACAATATAGTCAACTCTTGGCAATTCCAATCTCCTCACTGCTATO 
IU'l'Crri'GCl"I"rC"ri"rGATAGCCTACGATTTACAGCCCTCTTGTGTGTGA 

MKQEWFESNDFVKTTSKNKPEEQAQEVADKAEETIADLDTPIEKNTQLEEEVPQAEVELESQQEEKIEAPEDSEARTEIE 
EKKASNSTEEEPDI^KETEKVTIAEESQEALPQQKATTKEPLLISKSLESPYIPDQAPKSRDKWKEQVLDFWSWLVEAKS 
PTSKLETSn-HSYTAFLLLILFSASSFFFSIYHIKHAYYGHUSINSRFPEQLAPLTLFSIISILVATrLFFFSFLLGSFVVRR^^ 
QEKDWTLDKVLQQYSQLLAIPISSLLLLVSLLSLIAYDLQPSCVZ 

ID105 990 bp 

ATGCAACTCGCTTCTTCGGTCTACTCATTGTTCGTCTGGTACAATTTGTTOT 

GCATGCGTAAATGGACAAAAGGATTTCTCATCmGGTGTGGTGACTACCGrrATCGGCTTTATCCT^ 

GGTATCCAATCTGACGGGATTAAGAGCCTACTTrCCATGTCCAAAGAACCTGTCTATGAT^ 

CTAACCrTTGGCAAGGAAGTCGAAAACCTAGAAATTACTCrCCACCAACACACGCTCACCATCA^^ 

GATGATCAAATCCACATTTCITACCATCCATCrCTTTCTGCTCACCATGATCTTA 

CTCTGAGTCTCACTGATAAGAAACTGTCTGAAACTCCGTTTCTCT 

AAGTAGCTACTCTAGTCGTTTTGAAGAAGrrATTCTCCGACTACCAAAAGGGAGAACTCTAAj^ 

CTCAGCCAATCGCGGACAAACCACCATCATAAATGCTAGCmGAAAATGCGACCCTCAATACAAACAGCTATAT 

CCTCCGAATTGAAGGAAGTCGTATCAAAAACAGTAAACTCACAACGCCCAATATCGTTAATATCTTTGATACAGTT 

CTTACAGATAGTCAGCTAGAGTCAACAGAGAATCACTTCCACGCTGAAAATATCCAAGTCCATGGCAAGGTTGAA 

CTGACTGCCAAAGATTATCTCAGAATCATCCTAGACCAGAAAGAAAGCCAACGAATTAACTGGGACATCTCAAGC 

AACTATGGTTCTATCrrCCAATTCACAAGAGAAAAGCCTGAATCAAGAGGTACGGAATTAAGCAACCCTTACAAA 

ACTGAAAAAACCGATGTCAAGGATCAACTCATTGCGAGATCTGATGATAATATTGATCTAATATCCACACCAAGC 

AGACGTTGA 

MQLASSVYSLFVWYNLFLKKEREVISMRKWTKGFUFGV\TTVIGF1LLFVGIQSDGIKSLLSMSKEPVYDSRTC^ 
KEVENLErrLHQHTLTITDSFDDQIHISYHPSLSAHHDLlTNQNDRTLSLTDKKLSETPFLSSGIGGILHIASSYSSRFEEVIL 
RLPKGRTLKGINISANRGQTTIINASLENATLhrTNSYILRIEGSRIKNSKLTrPNIVNIFDTVLTDSQLESTENHFHAENIQV 
HGKVELTAKDYLRIILIXJKESQRINWDISSNYGSIFQFTREKPESRGTEI^NPYKTEKTDVKDQLIARSDDNIDLISTPSRR 



ID107 78 bp 



ATGATATGTAAAATGAAGCAGGGAGGGAGCAGGGCGTGCTGGGGATGGAGAGTGGGGGAGGGACGCTGCTATTT 
TAATC 

MICKMKQGGSRACWGWRVGEGRCYFN 
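
Each entry in these tables pairs a nucleotide sequence with its deduced amino acid sequence. As a rough illustration of that relationship (not the method used to prepare the listing), the sketch below translates a nucleotide sequence with the standard genetic code. The names `CODON_TABLE`, `translate` and `EXAMPLE_DNA` are hypothetical, and writing a stop codon as "Z" is an assumption based on how several peptide sequences in this listing end.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, stop_symbol="Z"):
    """Translate a DNA string in frame 1 using the standard genetic code.

    Whitespace is ignored, case is normalised, translation halts at the
    first stop codon (emitted as stop_symbol, an assumed convention), and
    any codon containing a non-ACGT character is reported as 'X'.
    A trailing partial codon is ignored.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            protein.append(stop_symbol)
            break
        protein.append(aa)
    return "".join(protein)


# The 78 bp entry as printed above (the final unpaired base is ignored).
EXAMPLE_DNA = """
ATGATATGTAAAATGAAGCAGGGAGGGAGCAGGGCGTGCTGGGGATGGAGAGTGGGGGAGGGACGCTGCTATTT
TAATC
"""

if __name__ == "__main__":
    print(translate(EXAMPLE_DNA))
```

Run on the 78 bp entry, the sketch reproduces the peptide printed beneath it. Reporting unreadable codons as "X" rather than guessing keeps damaged positions visible when a printed sequence is compared with its translation.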



ID109 714 bp 

CGATAAAGAGGCCTTGAGTAATCTCAATTTGCAGATTGAAAATGGAGAGATTATGGGCTTGATTGGTCATAATGG 

GGCTGGAAAATCGACCACTATAAAATCCCTAGTCAGTATCATTTCACCCAGCAGTGGTCGTATTTTGGTAGACG 

CAGGAGTTATCGGAAAATCGCTTGGCTATTAAACGAAAGATTGGCTACGTAGCAGACTCGCCTGAOT 

GCTTAACGGCCAATGA ATTTTG G GAATT GATCGCCTCATCCTATGATCTGAGTAGATCTGACIT 

AGCTAGGC TATT GAACGTTTTTGATTTTGCTGAAAATCGCTATCAGGTTATTGAAACT 

CAGAAAGTCTTTG TCAT CGGAGCACTCTTGTCTGATCCCGATATTTGGGTTTTGGACGAACCOT 

ATCCCCAGGCTGCCTTTGATTTGAAACAGATGATGAAGGAACATGCACAAAAAGGGAAGACAGTCTTGTTT^ 

CTCATGTCCTAGAGGTGGCAGAGCAAGTCTGTGATCGGATTGCCATTTTGAAAAAGGGGCATTTGATTTAnG^^ 

TAAGGTAGAGGACnTGAGGAAAGACCACCCAGACCAGTCTTTGGAAAGTATCTACCTTAGTCTTGCTGGTAG 

AGAGGAGGTTGCGGATGCGTCTCAAGGTCATTAA 

DKEALSNLNLQIENGEIMGLIGHNGAGKSTTIKSLVSIISPSSGRILVDGQELSENRLAIKRKIGYVADSPDLFLRLTANEF 
WELIASSYDLSRSDLEASLARLLNVFDFAENRYQVIETLSHGMRQKVFVIGALLSDPDIWVLDEPLTGLDPQAAFDLKQ 
MMKEHAQKGKTVLFSTHVLEVAEQVCDRIAILKKGHLIYCGKVEDLRKDHPDQSLESIYLSLAGRKEEVADASQGHZ 

ID112 360 bp 

ATGGCTTTGTTTTCAGAGAGAGGAGCAGTACGGAAGACACCAATGGCAAGTCCAATAATGAGACCTATGATGGTT 

CCGACGATAGAGATTAAAAGAGTGATACCAGCACCACGCAAGAGTTGTTGCCAGTTITCAGAAAGAATTTTAGCA 

ACTTGGCTAAAGAAACTA CTGCT AGTCTCrrCAGTTGTTGTAGCTTCGGCAGGTTGTTCOT 

TCAAGGCAACTTGGTCATCTTTTGAAATGGTTTCAATGCTGGCATTGATTTGGCTAATACGAT^ 

AGCCCGATAGCGATAGCTGTATCTTCTTCCCCAGTTTTGAAACCAGGTTCTACTTGA 
MALFSERGAVRKTPMASPIMRPMMVPTIEIKRVIPAPRKSCCQFSERILATWLKKLLLVSSVVVASAGCSLIIRSIKATWSS 
FEMVSMLALIWLIRLSFLRSPIAIAVSSSPVLKPGSTZ 

ID 128 - 3.43 

ATGAAATTTAGTAAAAAATATATAGCAGCTGGATCAGCTGTTATCGTATC 

CTTGAGTCTATGTGCCTATGCACTAAACCAGCATCGTTCGCAGGAAAATA 

AGGACAATAATCGTGTCTCTTATGTGGATGGCAGCCAGTCAAGTCAGAAA 
AGTGAAAACTTGACACCAGACCAGGTTAGCCAGAAAGAAGGAATTCAGGC 

TGAGCAAATTGTAATCAAAATTACAGATCAGGGCTATGTAACGTCACACG 

GTGACCACTATCATTACTATAATGGGAAAGTTCCTTATGATGCCCTCTTT 

AGTGAAGAACTCTTGATGAAGGATCCAAACTATCAACTTAAAGACGCTGA 

TATTGTCAATGAAGTCAAGGGTGGTTATATCATCAAGGTCGATGGAAAAT 
ATTATGTCTACCTGAAAGATGCAGCTCATGCTGATAATGTTCGAACTAAA 

GATGAAATCAATCGTCAAAAACAAGAACATGTCAAAGATAATGAGAAGGT 

TAACTCTAATGTTGCTGTAGCAAGGTCrCAGGGACGATATACGACAAATG 

ATGGTTATGTCTTTAATCCAGCTGATATTATCGAAGATACGGGTAATGCT 

TATATCGTTCCTCATGGAGGTCACTATCACTACATTCCCAAAAGCGATTT 
ATCTGCTAGTGAATTAGCAGCAGCTAAAGCACATCTGGCTGGAAAAAATA 

TGCAACCGAGTCAGTTAAGCTATTCTTCAACAGCTAGTGACAATAACACG 

CAATCTGTAGCAAAAGGATCAACTAGCAAGCCAGCAAATAAATCTGAAAA 

TCTCCAGAGTCTTTTGAAGGAACTCTATGATTCACCTAGCGCCCAACGTT 

ACAGTGAATCAGATGGCCTGGTCTTTGACCCTGCTAAGATTATCAGTCGT 
ACACCAAATGGAGTTGCGATTCCGCATGGCGACCATTACCACTTTATTCC 

TTACAGCAAGCTTTCTGCCTTAGAAGAAAAGATTGCCAGAATGGTGCCTA 

TCAGTGGAACTGGTTCTACAGTTTCTACAAATGCAAAACCTAATGAAGTA 

GTGTCTAGTCTAGGCAGTCTTTCAAGCAATCCTTCTTCTTTAACGACAAG 

TAAGGAGCTCTCTTCAGCATCTGATGGTTATATTTTTAATCCAAAAGATA 
TCGTTGAAGAAACGGCTACAGCTTATATTGTAAGACATGGTGATCATTTC 

CATTACATTCCAAAATCAAATCAAATTGGGCAACCGACTCTTCCAAACAA 

TAGTCTAGCAACAC(nTCTCCATCTCTTCCAATCAATCCAGGAACTTCAC 

ATGAGAAACATGAAGAAGATGGATACGGATTTGATGCTAATCGTATTATC 

GCTGAAGATGAATCAGGTTTTGTCATGAGTCACGGAGACCACAATCATTA 
TTTCTTCAAGAAGGACTTGACAGAAGAGCAAATTAAGGTGCGCAAAAACA 

TTTAG 

MKFSKKYIAAGSAVIVSLSLCAYALNQHRSQENKDNNRVSYVDGSQSSQK 

SENLTPDQVSQKEGIQAEQIVIKITDQGYVTSHGDHYHYYNGKVPYDALF 
SEELLMKDPNYQLKDADIVNEVKGGYIIKVDGKYYVYLKDAAHADNVRTK 

DEINRQKQEHVKDNEKVNSNVAVARSQGRYTTNDGYVFNPADIIEDTGNA 

YIVPHGGHYHYIPKSDLSASELAAAKAHLAGKNMQPSQLSYSSTASDNNT 

QSVAKGSTSKPANKSENLQSLLKELYDSPSAQRYSESDGLVFDPAKIISR 

TPNGVAIPHGDHYHFIPYSKLSALEEKIARMVPISGTGSTVSTNAKPNEV 
VSSLGSLSSNPSSLTTSKELSSASDGYIFNPKDIVEETATAYIVRHGDHF 

HYIPKSNQIGQPTLPNNSLATPSPSLPINPGTSHEKHEEDGYGFDANRII 

AEDESGFVMSHGDHNHYFFKKDLTEEQIKVRKNI* 
TABLE 2 



ID2 840 bp 

ATGGGAATTGCTCTAGAAAATGTGAATTTTACATATCAAGAAGGTACT^ 

i 1 i cn I GACGATTGAAGATGGCTCrrATACAGCTTTAArrGGGCACACAGGTAGTGGTAAATCAACTATTTTACA 
ACTCTTAAATGGTTTATTGGTGCCAAGTCAAGGGAGTGTGAGGGTTTTTGATACCTT^^ 
AATAAAGATATTCGTCAAATTAGAAAACAGGTTGGCTTGGTAm 
CGGTTTTGAAGGACGTTGCTTTTGGACCGCAAAATTTTGGAGTTTCTGAAG^^ 

U GAAACTGGCTCTGGTTGGAATTGATGAATCACTTTTTGATCGTAGTCCGTTTGAGCTGT^^ 

CGTGTTGCCATTGCAGGCATACnTGCCATGGAGCCAGCTATATTAGTCTTAGATGAGCCAACAGCTGGTCTAG 
CTCTAGGGAGAAAAGAGTTGATGACCCTGTTCAAAAAACTCCACCAGTCAGGGATGACCATCGTCTTGGTAACGC 
ATTTGATGGATGA TGTTG CTGAATATGCGAATCAAGTCTATGTAATGGAAAAGGGACGTTTAGTAAAGGGGGGCA 
AACCAAGTGATGTCTTTCAAGACGTTGTTTTTATGGAAGAAGTTCAGTTGGGAGTACCTAAAATTACGGC 

TAAACGATTGGCTGATAGAGGCGTGTCATTTAAACGATTACCGATTAAGATAGAGGAGTTCAAGGAGTCGCTAAA 
TGGATAG 



MGIALENVNFITQEGTPLASAAl^DVSLTIEDGSYTALIGHTGSGKSTILQLLNGLLVPSQGSVRVFDTLITSTSKNKDIR 
QIRKQVGLVFQFAENQIFEETVLKDVAFGPQNFGVSEEDAVKTAREKLALVGIDESLFDRSPFELSGGQMRRVAIAGILA 
MEPAILVLDEPTAGLDPLGRKELMTLFKKLHQSGMTIVLVTHLMDDVAEYANQVYVMEKGRLVKGGKPSDVFQDVV 
FMEEVQLGVPKITAFCKRLADRGVSFKRLPIKIEEFKESLNGZ 

ro?^?60 bp 

25 TACCCGGTAGTCTTAGCAGACACATCTAGCTCTGAAGATGCnTTAAACATCTCrGATAAAGA 

AATAAAGAGAAACATGAAAATATCCATAGTGCTATGGAAAmCACAGGATTTTAAAGAGAAGAAAACAGCAGTC 
ATTAAGGAAAAAGAAGTTGTTAGTAAAAATCCTGTGATAGACAATAACACTAGCAATGAAGAAGCAAAAATCAA 
AGAAGAAAATTCCAATAAATCCCAAGGAGATTATACGGACTCATTTGTGAATAAAAACACAGAAAATCCCAAAAA 
AGAAGATAAAGTTGTCT ATAT TGCTGAATTTAAAGATAAAGAATCTGGAGAAAAAGCAATCAAGGAACTATCCAG 

3U TCTTAAGAATACAAAAGTTTTATATACTTATGATAGAATTTTTAACGGTAGTGCCATAGAAA 

TTGGACAAAATrAAACAAATAGAAGGTATTTCATCGGTTGAAAGGGCACAAAAAGTCCAACCCATGATGAATCAT 
GCCAGAAAGGAAATTGGAGTTGAGGAAGCTATTGATTACCTAAAGTCTATCAATGCTCCGTTTGGGAAAAATm 
GATGGTAGAGGTATGGTCATTTCAAATATCGATACTGGAACAGATTATAGACATAAGGCTATGAGAATCGATGAT 
GATGCCAAAGCCTCAATGAGATTTAAAAAAGAAGACnTAAAAGGCACTGATAAAAATTATTGGTTGAGTGATAAA 

ATCCCTCATGCGTTCAATTATTATAATGGTGGCAAAATCACTGTAGAAAAATATGATGATGGAAGGGATTATTTTG 
ACCCACATGGGATGCATATTGCAGGGATrenTGCTGGAAATGATACTGAACAAGACATCAAAAACTTTAACGGCA 
TAGATGGAATTGCACCTAATGCACAAATTTTCTCTTACAAAATGTATTCTGACGCAGGATCTG 
TGAAACAATGTTTCATGCTATTGAAGATTCTATCAAACACAACGTTGATGTTGTTTCGGTATCATCTGGTTT^ 
GGAACAGGTCTTGTAGGTGAGAAATATTGGCAAGCTATTCGGGCATTAAGAAAAGCAGGCATTCCAATGGTTGTC 

4U GCTACGGGTAACTATGCGACTTCTGCTTCAAGTTCTTCATGGGATITAGTAGCAAATAATCATCTGA^ 

ACACTGGAAATGTAACACGAACTGCAGCACATGAAGATGCGATAGCGGTCGCTTCTGCTAAAAATCAAACAGTTG 

AGTTTGATAAAGTTAACATAGGTGGAGAAAGrmAAATACAGAAATATAGGGGCCTTTTTCGAT^ 

TCACAACAAATGAAGATGGAACAAAAGCTCCTAGTAAATTAAAATTTGTATATATAGGCAAGGGGCAAGACCAAG 

ATTTGATAGGTTTGGATCTTAGGGGCAAAATTGCAGTAATGGATAGAATTTATACAAAGGATTTAAAAAATGC^ 

TAAAAAAGCTATGGATAAGGGTGCACGCGCCATTATGGTTGTAAATACTGTAAATTACTACAATAGAGATAATTG 

GACAGAGCrrCCAGCTATGGGATATGAAGCGGATGAAGGTACTAAAAGTCAAGTGTrrTCAATTTCAGGAGATGA 

TGGTGTAAAGCTATGGAACATGATTAATCCTGATAAAAAAACTGAAGTCAAAAGAAATAATAAAGAAGATTrTAA 

AGATAAATT GGAG CAATACTATCCAATTGATATGGAAAGTTTTAATTCCAACAAACCGAATGTAGGTGACGAAAA 

AGAGATTGACTTTAAGTTTGCACCTGACACAGACAAAGAACTCTATAAAGAAGATATCATCGTTCCAGCAGGATC 

DU TACATCTTGGGGGCCAAGAATAGATTTACTTTTAAAACCCGATGTTTCAGCACCTGGTAAAAATATT 

CTTAAT GTTA TTAATGGCAAATCAACTTATGGCTATATGTCAGGAACTAGTATGGCGACTCCAATCGTGGCAGCTT 
CTACTGTTTTGATTAGACCGAAATTAAAGGAAATGCTTGAAAGACCTGTATTGAAAAATCTTAAGGGAGA 
AAATAGATCTTACAAG TCTTA CAAAAATTGCCCTACAAAATACTGCGCGACCTATGATGGATGCAACTTCTTGGA 
AAGAAAAAAGTCAATACTTTGCATCACCTAGACAACAGGGAGCAGGCCTAATTAATGTGGCCAATGCmGAGAA 

JJ ATGAAGTTGTAGCAACmCAAAAACACTGATTCTAAAGGTTTGGTAAACTC 

AATAAAAGGTGATAAAAAATACTTTACAATCAAGCITCACAATACATCAAACAGACCnTrGACT^ 
GCATCAGCGATAACTACAGATTCTCTAACTGACAGATTAAAACTTGATGAAACATATAAAGATGAAAAATCT^ 
GArGGTAAGCAAATrGTrCCAGAAATTCACCCAGAAAAAGTCAAAGGAGCAAATATCACATTTGAGCATGATACT 
TTCACTATAGGCGCAAATTCTAGCTTTGATTTGAATGCGGTTATAAATGTTGGAGAGGCCAAAAACAAA^^^ 

OU TTTGTAGAATCATTTATTCATTITGAGTCAGTGGAAGCGATGGAAGCTCTAAACTCCAGCGGGAAGAAAATAAAC 
TTCCAACCriLii-iGTCGATGCCTCTAATGGGATITGCTGGGAATTGGAACCACGAACCAATCCTTGATAAATGGG 
CTTGGGAAGAAGGGTCAAGATCAAAAACACTGGGAGGTTATGATGATGATGGTAAACCGAAAATTCCAGGAACCT 
TAAATAAGGGAATTGGTGGAGAACATGGTATAGATAAATTTAATCCAGCAGGAGTTATACAAAATAGAAAAGATA 
AAAATACAACATCCCTGGATCAAAATCCAGAATTATTTGCirrCAATAACGAAGGGATCAACGCTCCATCATCAA 

OD GTGGTTCTAAGATTGCTAACATTTATCCTTTAGATrCAAATGGAAATCCTCAAGATGCTCAACT^ 
AACACCTTCTCCACTTGTATTAAGAAGTGCAGAAGAAGGATTGATTTCAATAGTAAATACAAATAAAGAGGGAGA 
AAATCAAAGAGACTTAAAAGTCATTTCGAGAGAACACTTTATTAGAGGAATTTTAAATTCTAA^ 
AAAGGGAATCAAATCATCTAAACTAAAAGTTTGGGGTGACITGAAGTGGGATGGACTCATCTATAATCCTAGAGG 
TAGAGAAGAAAATGCACCAGAAAGTAAGGATAATCAAGATCCTGCTACTAAGATAAGAGGTCAATTTGAACCGAT 
J TGCGGAAGGTCAATATTTCTATAAATTTAAATATAGATTAACTAAAGATTACCCATGGCAGGTTTCCTAT^ 
GTAAAAATTGATAACACCGCCCCTAAGATTGTTTCGGTTGATTTTTCAAATCCTGAAAAAATTAAG^^ 
AGGATACTTATCATAAGGTAAAAGATCAGTATAAGAATGAAACGCTATTTGCGAGAGATCAAAAAGAACATCCTG 
AAAAATTTGACGAGATTGCGAACGAAGTTTGGTATGCTGGCGCCGCTCTTGTTAATGAAGATGGAGAGGTTGA^ 
AAAATCTTGAAGTAACrrACGCAGGTGAGGGTCAAGGAAGAAATAGAAAACTTGATAAAGACGGAAATACCATTT 

lU ATGAAATTAAAGGTGCGGGAGATTTAAGGGGAAAAATCATTGAAGTCATTGCATTAGATGGTTCTAGCAATTTCA 
CAAAGATTCATAGAATTAAATTTGCTAATCAGGCTGATGAAAAGGGGATGATTTCCTATTATCTAGTAGATCCT^ 
TCAAGATTCATCTAAATATCAAAAGCTTGGCGAGATTGCAGAATCTAAATTTAAAAATTTAGGAAATGGAA 
GGG TAGTC TAAAAAAAGATACAACTGGGGTAGAACATCATCATCAAGAAAATGAAGAGTCTATTAAAGAAAAAT 
CTAGTTTTACTATTGATAGAA ATATT rCAACAATTAGAGACTTTGAAAATAAAGACT^ 

GAAATTTAGAGAAGTTGATGATTTTACAAGTGAAACTGGTAAGAGAATGGAGGAATACGATTATAAATACGATGA 
TAAAGGAAATATAATAGCCTACGATGATGGGACTGATCTAGAATATGAAACTGAGAAACTTGACGAAATCAAATC 
AAAAATTTATGGTGTTCTAAGTCCGTCTAAAGATGGACACmGAAATTCTTGGAAAGAT^ 
AATGCCAAGGTATATTATGGGAATAACTATAAATCTATAGAAATCAAAGCGACCAAGTATGATTTCCACTCAAAA 
ACGATGACATTTGATCTATACGCTAATATTAATGATATTGTGGATGGArrAGCTTTTGCAGGAGATATGAGATT^ 

20 TTGTTAAAGATAATGATCAGAAAAAAGCTGAAATTAAAATTAGAATGCCTGAAAAAATTAAGGAAACTAAATC^^ 
AATATCCCTATGTATCAAGTTATGGGAATGTCATAGAATTAGGGGAAGGAGATCTTTCAAAAAACAAACCAGACA 
ATTTAACTAAAATGGAATCTGGTAAAATCTATTCTGATTCAGAAAAACAACAATATCTGTTAAAGGATAAT^^ 
TCTAAGAAAAGGCTATGCACTAAAAGTGACTACCTATAATCCTGGAAAAACGGATATGTTAGAAGGAAATGGAGT 
CTATAGCAAGGAAGATATAGCAAAAATACAAAAGGCCAATCCTAATCTAAGAGCCCTTTCAGAAACAACAATTTA 

Id tgctgatagtagaaatgttgaagatggaagaagtacccaatctgtattaatgtcggctttggacggctttaacat^ 
ataaggtatcaagtgtttacatttaaaatgaacgataaaggggaagctatcgataaagacggaaatcttgtgaca 

GATTCTTCTAAACTTGTATTATTTGGTAAGGATGATAAAGAATACACrGGAGAGGATAAGTTCAATC^^ 
TAAAAGAAGATGGCTCCATGTTATTTATTGATACCAAACCAGTAAACCTTTCAATGGATAAGAACTACTI^ 
ATCTAAATCTAATAA AATTTATGTACG AAATCCAGAATTTTATTTAAG AGGTAAG ATTTCTG AT AAGGGTC 

iU AACTGGGAATTGAGAGTTAATGAATCGGTTGTAGATAATTATTTAATCTACGGAGATT^ 

GAGATTTTAATATTAAGCrGAATGTTAAAGACGGTGACATCATGGACTGGGGAATGAAAGACTATAAAGCAAA^^ 
GATITCCAGATAAGGTAACAGATATGGATGGAAATGTTTATCnTCAAACTGGCTATAGCGATTTGAATGCT 
AGTTGGAGTCCACTATCAGTTTTTATATGATAATGTTAAACCCGAAGTAAACATTGATCCTAAGGGAAATACT^ 
ATCGAATATGCTGATGGAAAATCTGTAGTCTTTAACATCAATGATAAAAGAAATAATGGATTCGATGGTGAGATT 

CAAGAACAACATATTTATATAAATGGAAAAGAATATACATCATTTAATGATATTAAACAAATAATAGACAAGACA 
CTAAACATTAAGATTGTTGTAAAAGATTTTGCAAGAAATACAACCGTAAAAGAATTCATTTTAAAT^ 
GGAGAGGTAAGTG AATTA AAACCTCATAGGGTAACTGTGACCATTCAAAATGGAAAAGAAATGAGTTCAACGATA 
GTGTCGGAAGAAGATTTTATTITACCTGTTrATAAGGGTGAATTAGAAAAAGGATACCAAm 
TTTCTGGTTTCGAAGGTAAAAAAGACGCTGGCTATGTTATTAATCTATCAAAAGATACCm 

CAAGAAAATAGAGGAGAAAAAGGAGGAAGAAAATAAACCTACTTTTGATGTATCGAAAAAGAAAGATAACCCAC 
AAGTAAACCATAGTCAATTAAATGAAAGTCACAGAAAAGAGGATTTACAAAGAGAAGAGCATTCACAAAAATCT 
GATTCAACTAAGGATGTTACAGCTACAGTTOTGATAAAAACAATATCAGTAGTAAATCAACTACTAACAATCCT 
AATAAG TTGCCAAAAACTGGAACAGCAAGCGGAGCCCAGACACTATTAGCTGCCGGAATAATGTTTATAGTAGGA 
ATTTTTCTTGGATTGAAGAAAAAAAATCAAGATTAA 


YPVVLADTSSSEDALNISDKEKVAENKEKHENIHSAMETSQDFKEKKTAVIKEKEVVSKNPVIDNNTSNEEAKIKEENSN 
KSQGDYTDSFVNKhrrENPKKEDKVVYIAEFKDKESGEKAIKEl^SLKhrrKVLYTYDRIFNGSAlETTPDNLDKIKQIEGK 
SVERAQKVQPMMNHARKEIGVEEAIDYLKSINAPFGKNFDGRGMVISNIDTGTDYRHKAMRIDDDAKASMRFKKEDL 
KGTDKNYWLSDKIPHAFNYYNGGKITVEKYDDGRDYFDPHGMHIAGILAGNDTEQDIKNFNGIDGIAPNAQIFSYKMY 
SDAGSGFAGDETMFHAIEDSIKHNVDVVSVSSGFTGTGLVGEKYWQAIRALRKAGIPMVVATGNYATSASSSSWDLVA 
NNHLKMTDTGNVTRTAAHEDAIAVASAKNQTVEFDKVNIGGESFKYRNIGAFFDKSKITTNEDGTKAPSKLKFVYIGK 
GQDQDLIGLDLRGKIAVMDRIYTKDLKNAFKKAMDKGARAIMVVNTVNYYNRDNWTELPAMGYEADEGTKSQVFSI 
SGDDGVKLWNMINPDKKTEVKRNNKEDFKDKLEQYYPIDMESFNSNKPNVGDEKEIDFKFAPDTDKELYKEDIIVPAG 
STSWGPRIDLLLKPDVSAPGKNIKSTLNVINGKSTYGYMSGTSMATPIVAASTVLIRPKLKEMLERPVLKNLKGDDKIDL 
TSLTKIALQNTARPMMDATSWKEKSQYFASPRQQGAGLINVANALRNEVVATFKNTDSKGLVNSYGSISLKEIKGDKK 
YFTIKLHNTSNRPLTFKVSASAITTDSLTDRLKLDETYKDEKSPDGKQIVPEIHPEKVKGANrrFEHDTFTIGANSSFDLN 
AVINVGEAKNKNKFVESFlHFESVEAMEALNSSGKKINFQPSLSMPtMGFAGNWNHEPILDKWAWEEGSRSKTLGGYD 
DDGKPKIPGTLNKGIGGEHGIDKFNPAGVIQNRKDKNTTSLDQNPELFAFNNEGINAPSSSGSKIANIYPLDSNGNPQDA 
QLERGLTPSPLVLRSAEEGLISIVNTNKEGENQRDLKVISREHFIRGILNSKSNDAKGIKSSKLKVWGDUCWDGUYNPRG 
REENAPESKDNQDPATKIRGQFEPIAEGQYFYKFKYRtTKDYPWQVSYIPVKIDNTAPKIVSVDFSNPEKlKLITKDTYHK 
VKDQYKNETLFARDQKEHPEKFDEIANEVWYAGAALVNEDGEVEKNLEVTYAGEGQGRNRKLDKDGNTIYEnCGAG 
DLRGKIIEVIALDGSSNFTKIHRIKFANQADEKGMISYYLVDPDQDSSKYQKLGEIAESKFKNLGNGKEGSLKKDTTGVE 
HHHQENEESnCEKSSFTIDRNISTIRDFENKDLKKLIKKKFREVDDFTSETGKRMEEYDYKYDDKGNIIAYDDGTDLEYE 
TEKLDEIKSKIYGVLSPSKDGHFEILGKISNVSKNAKVYYGNNYKSIEIKATKYDFHSKTMTFDLYANINDIVDGLAFAG 
DMRLFVKDNDQKKAEIKIRMPEKIKETKSEYPYVSSYGNVIELGEGDLSKNKPDNLTKMESGKIYSDSEKQQYLLKDNII 
LRKGYALKVTTYNPGKTDMLEGNGVYSKEDIAKIQKANPNLRALSETTIYADSRNVEDGRSTQSVLMSALDGFNIIRYQ 

VFTFKMNDKGEAIDKDGNLVTDSSKLVLFGKDDKEYTGEDKIWEAIKEDGSMLFIDTKPVNLSMDKNYFNPSKSNKI 

YVRNPEFYLRGKISDKGGFNWELRVNESVVDNYLIYGDLHIDNTRDFNIKLNVKDGDIMDWGMKDYKANGFPDKVTD 

MDGNVYLQTGYSDLNAKAVGVHYQFLYDNVKPEVNIDPKGNTSIEYADGKSVVFNINDKRNNGFDGEIQEQHIYINGK 

EYTSFNDIKQIIDKTLNnCIVVKDFARNTTVKEFILNKDTGEVSELKPHRVTVTIQNGKEMSSTIVSEEDFILPVYKGELEK 

GYQFDGWEISGFEGKKDAGYVINLSKDTFIKPVFKKIEEKKEEENKPTFDVSKKKDNPQVNHSQLNESHRKEDLQREEH 

SQKSDSTKDVTATVLDKNNISSKSTTNNPNKLPKTGTASGAQTLLAAGIMFIVGIFLGLKKKNQDZ 

ID6 597 bp 

CTTGAATTAAATAAAAAACGTCATGCGACTAAGCATTTTACTGATAAGCTTGTTGATCCCAAAGATG^ 

CTATCGAAATTGCAACCTTAGCGCCAAGCGCCCACAACAGCCAGCCTTGGAAATTTGTGGTGGTACGTGAGAAAA 

ATGCTGAACTGGCAAAGTTAGCTTATGGTTCCAATTTTGAACAGGTATCATCAGCGCCTGTAACCATO 

TACAGAT ACGGA CirAGCCAAACGTGCTCGTAAGATTGCCCGTGTTGGTGGTGCTAATAACTTTTCT^ 

CnrCAATATTTTATGAAAAATCTGCCAGCTGAGriTGCCCGTTACAGTGAGCAACAAGTCAGCGACTAC 

TCAA TGCAGGTTTGGTTGCCATGAACTTGGTTCTTGCATTGACAGACCAAGGAATTGGTTCTAACATTA 

TTTTGACAAATCAAAAGTTAATGAAGTTTTGGAAATCGAAGACCGTTTCCGCCCAGAACTCTTGATCACA 

TATACAGACGAAAAATTGGAACCAAGCTACCGCTTGCCAGTAGATGAAATCATCGAGAAAAGATAG 

LELNKKRHATKHFTDKLVDPKDVRTAIEIATLAPSAHNSQPWKFVVVREKNAELAKLAYGSNFEQVSSAPVTIALFTDT 
DLAKRARKIARVGGANNFSEEQLQYFMKNLPAEFARYSEQQVSDYLALNAGLVAMNLVLALTDQGIGSNIILGFDKSK 
VNEVLEIEDRFRPELLITVGYTDEKLEPSYRLPVDEIIEKRZ 

ID7 1401 bp 

ATGACAGCAATTGATTTTACAGCAGAAGTAGAAAAACGCAAAGAAGACCTCTTGGCTGACTTGm 

GAAATCAATTCAGAACGTGATGACAGCAAGGCTGATGCCCAGCATCCATTTGGGCCTGGTCCAGTAAAAGCCTTG 

GAGAAATTCCrrGAAATCGCAGACCGCGATGGCTACCCAACTAAGAATGTTGATAACTATGCAGG 

TTTGGTGATGGAGAAGAAGTTCTCGGAATCmGCCCATATGGATGTGGTGCCTGCTGGTAGCGGTTGGGACACAG 

ACCCTTACACACCAACTATCAAAGATGGTCGCCTTTATGCGCGCGGGGCTTCGGACGATAAGGGTCCTAC^ 

CTTGTTACTATGGTTTGAAAATCATCAAAGAATTGGGTCrTCCAACTTCTAAGAAAGTTCGOT 

AGACGAAGAATCAGGCTGGGCAGACATGGACTACTACTTTGAGCACGTAGGACTTGCCAAACCAGATTTCGGT^ 

CTCACCAGATGCTGAArrrCCAATCATCAATGGTGAAAAAGGAAATATCACGGAATACCTCCACTTTGCAGGAGA 

AAATACAGGTGTTGCCCGTCTTCACAGCTTTACAGGTGGTTTACGTGAAAATATGGTACCAGAATCAGCAACAG 

AGTCGTTTCAGGTGACTTGGCTGACTTGCAAGCTAAACTAGATGCCTTTGTTGCAGAACACAAAm 

ACTCCAAGAAGAAGCTGGCAAATACAAGGTGACGATCATTGGTAAATCAGCCCACGGTGCTATGCCTGCTTC^^ 

TGTCAATGGCGCAACITACCITGCCCTCTTCCTCAGCCAGTTTGGCTTTGCT^ 

ATCGCAGGTAAAATTCTCTTGAACGATCATGAGGGTGAAAATCTTAAGATTGCT 

GCTCTTTCTATGAATGCCGGCGTCTTCCACTTCGATGAAACAAGTGCTGATAATACCATTGCCCTCAACATC^ 

ATCCAAAAGGAACAAGTCCAGAACAAATCAAGTCAATCCnrGAAAACTTGCCAGTTGTTTCnXj 

ACACGGTCACACGCCTCACTATGTGCCAATGGAAGATCCACTTGTGCAy^ 

AACTGGCTTTAAAGGTCATGAACAAGTCATCGGTGGTGGAACCTTTGGTCGCTTGCTAGAACGCGG^^ 

CGGTGCrATGTTCCCAGACTCGATTGATACCATGCACCAAGCCAATGAATTTATCGCCTTGGATGATCrm 

GCAGCAGCAATTTATGCCGAAGCTATTTACGAATTGATCAAATAA 

MTAIDFTAEVEKRKEDLLADLFSLLEINSERDDSKADAQHPFGPGPVKALEKFLEIADRDGYPTKNVDNYAGHFEFGD 

GEEVLGIFAHMDVVPAGSGWDTDPYTPTIKDGRLYARGASDDKGPTTACYYGLKIIKELGLPTSKKVRFIVGTDEESGW 

ADMDYYFEHVGLAKPDFGFSPDAEFPIINGEKGNrrEYLHFAGENTGVARLHSFTGGLRENMVPESATAVVSGDLADL 

QAKLDAFVAEHKLRGELQEEAGKYKVTIIGKSAHGAMPASGVNGATYLALFLSQFGFAGPAKDYLDIAGKILLNDHEG 

ENLKIAHVDEKMGALSMNAGVFHFDETSADNTIALNIRYPKGTSPEQIKSILENLPVVSVSLSEHGHTPHYVPMEDPLVQ 

TLLNIYEKQTGFKGHEQVIGGGTFGRLLERGVAYGAMFPDSIDTMHQANEFIALDDLFRAAAIYAEAIYELIKZ 

ID8 1617 bp 

GTGTATACTATTATAAAATCAAATATAAAAAAATTTAGTrrATTAACGATATTTATTGTTGCTGGTCAA 

AATTTATGCAGCAACTATTAATGCTCTGGTGTTGAATGAATTAATTGCGATGAATTTAGAGCGGTTTT^ 

TCAATCTACCAAATGATTGTCTGGTGTGGGATAATATTCCTTGACTGGGTAGTGAAAAATTATCAGGTTGAAGTGA 

TCCAAGAGTTTAATCTAGAGATTCGAAATAGAGTTGCCACAGACATCTCTAACTCTACCTATCAAGAATTTCAT^^ 

TAAAT CATCAGGAACATATCTTTCGTGGCTAAATAATGATGTTCAGACTTTAAATGATCAGGCGTTTA^ 

TTTTTAGTAA TAAA AGGAATTTCTGGTACTATATTTGCAGTTGTGACTCTTAATCACTATC^ 

AGCCACCTTGTTTTCATTAATG ATTATGCT ACTTGTACCAAAAATCTTTGCATCGAAA^ 

AATTTAACTAACCAAAATGAAGCin-ri'ri'AAAATCTAGTGAGACTATATTGAATGGATITGATGTGTTAGCGTCCT 

TGAATCTTTTATATGTATTGCCTAAGAAAATTAAAGAAGCAGGAArrTTATTAAAGATGGT^^ 

CAACTGTAGAAACGTTAGCAGGCGCTATTAGCTTCTTTCTCAATATTTTTm 

GGCTATCTTGCAATAAAAGGAATAGTGAAAATTGGTACrATTGAAGCAATAGGAGCACTAACAGGTGTTAT^^ 
ACAGCGCTAGGTGAATTAGGAGGTCAATTATCCTCTATTATTGGTACGAAGCCrATTTTTT^ 
TTAATCCAATTGAGTCAAATAAAATGAATGATATCGAACCAAATGAGGTGAATAGAGATTTTCCGTTATATGAAG 
CAAAAAA TATTT GCTATAAGTATGGAGATAAAGAAATATTAAAAAACTTAAATTTTTGTTTTCAACG 
GTATTTAATTITAGGTGAAAGTGGAAGCGGGAAATCTACATTATTAAAATTATTGAATGGCTT^ 
AGTGGAGAATTGCGATTCTGCGGGGATGATATAAAAAAAACCTCCTATTTAAATATGGTTTCGAATGTTCTATATC 
TAGATCAAAAAGCTTATTTGTTTGAAGGTACGArrAGAGATAATATTTTAT^ 
AATACTACAGTCTTTAGAGCAAGTTGGTTTGAGTGTAAAAGATTTTCCTAATAACATTT^ 

GATGATGGGAGATTACTGTCAGGAGGGCAGAAACAAAAAArrACTTTAGCrAGAGGGCTAArrAGAAATAAGAA 
AATAGTATTAATTGACGAGGGAACTTCTGCTATCGATAGGAGAACTTCGTTAGCGATTGAACGTAAGATATTAGA 
10 TAGAGAGGATTTGACTGTCATTATTGTTACCCATGCTCCGCATCCGGAACnTAAACAATATTTTA^^ 
CAATTTCCAAAGGATTTTATTTAA 

MYTIIKSNKKFSLLTIFIVAGQLLLIYAATINALVLNELIAMNLERFLKl^IYOMlVWCXjnFLDWVVKNYQVEVIQEFNL 
EIRNRVATDISNSTYQEFHSKSSGTYLSWLNNDVQTLNDQAFKQLFLVIKGISGTIFAVVTLNHYHWSLTVATLFSLMIM 
LLVPKIFASKMREVSLNLTNQNEAFLKSSETILNGFDVLASLNLLYVLPKKIKEAGILLKMVIQRKTTVETLAGAISFFLNI 
FFQISLVFLTGYIjVIKGIVKIGTIEAIGALTGVIFTALGELGGQLSSIIGTKPIFUCLYSINPIESNKMNDIEPNEVNRDFPLYE 

aknicykygdkeilknlnfcfqrnekylilgesgsgkstllkllngflrdysgelrfcgddikktsylnmvsnvlyvdq 
kaylfegtirdnilleenytdeeilqsleqvglsvkdfpnnildyyvgddgrllsggqkqkitlargurnkkivlidegt 
saidrrtslaierkildredltviivthaphpelkqyftkiyqfpkdfiz 

ID9 705 bp 

ataacagttaaacagattatggacgaaatagccgtttcagatatgactgcaaggcgctatttacaggaattagct 
gataaagatttgctgattcgtgtgcatggtggagctgaaaaacttcgaaccaactcccttttg 

caaatattgaaaaacaagccctccaaacggcagaaaaacaagaaatagcccattttgcaggcagtctagtagaa 
gaaagagaaactattttcattggaccaggaacaacattagagttttttgcgcgtgagttgcctato 
gcgtcgtaaccaacagtctacctgtttttctgattttaagcgaacgaaaattaacagatttgat^ 
a aatt atcgcgatattacaggtgctmgttggtacattgaccctacaaaatctctctaatctccaatt^ 
gctt tcgt tagctgtaatggtattcaaaacggagctctagctacttttagcgaggaagagggagag 

atcgctttaaataattctaataaaaaatatttactcgcagatcatagcaagttcaataagtttgat^^ 
ttataatgtatcaaatcttgatactattgtttcagattctaaactaagtgattcaatccttm 
acattaaagtcatcaagccttaa 

itvkqimdeiavsdmtarrylqeladkdllirvhggaeklrtnslltnersniekqalqtaekqeiahfagslveereti 
figpgttleffarelpidnirvvtnslpvflilserkltdliliggnyrditgafvgtltlqnlsnlqfskafvscngiqnga 
latfseeegeaqrialnnsnkkylladhskfnkfdfytfynvsnldtivsdsklsdsilfklskhikvikpz 

ID10 483 bp 

atgactgagttttcgttagatcttcttctagaagccattaaactagctcgttggacctactact^ 

agctagacaaaacagataaagaccaagagcttaaaactgaaattcaatccatctttatcgaacacaagggaaat^ 
atgcttatcgccgggttcatttagaactaagaaatcgtggttatctggtaaatcataaaagagttcaaggcttgat 
gaaagtactcaatttacaagctaaaatgcgaaagaaacgaaaatattcttctcataaaggagacgttggtaagaa 
ggcagagaatctcattcaagcccaatttgaaggctctaaaacaatggaaaagtgctacacagatgtgactgaatt 

tgccattccagcaagtactcaaaagctttacttatcaccagttttagatgg 

AATCTTTCTTGTTCGCCTAATTTAGAATAA 

mtefsldllleaiklarwtyyyhucqldktdkdqelkteiqsifiehkgnyayrrvhlelrnrgylvnhkrvqglmk 
vlnlqakmrkkrkysshkgdvgkkaenliqaqfegsktmekcytdvtefaipastqklylspvldgfnseiiafnlscs 
pnlez 

ID14 1266 bp 

ccaggatttggtaccgttgcaagtggtgtgcctttcctcctaaaggaaaatggaggaaaaatcaatcaatcagca 
cattcagatatcaaagttgctaaggtattggtcaaggatgaagatgaaaaaaatcgcttgcttgcagcagggaat 
gactttaacittgtaaccaatg^ 

gtattgagcctgctaaaaccirratcactcgtgccntggaagctggaaaacacgttgttact^ 

ttragcrgtccatggcgcagaattgctagaaatcgctcaagctaacaaggtagcactttactacgaagcagcagt 

tgctggtgggattccaattcttcgtacrrtagcaaattccttggcttctgataaaattacgcgcg^ 

gtcaacggaacttccaacttcatggtgaccaagatggtggaagaaggctggtcttacgatgatgctcttgcggaa 
g caca acgtctaggat ttgca gaaagcgatccgacgaatgacgtagatgggattgatgcagcctacaagatggtt 
attttgagccaatttgccmggcatgaagattgcctttgatgatgtagcccacaagggaatccgc^ 
cagaagacgtagctgtagctcaagagcttggttacgtagtgaaattggttggttctattgaggaaacntct^ 
ta ttgc tgcagaagtgactccaaccttcctacctaaagcgcacccacntgctagtgtgaatggcgtaatgaacgct 

gtctttgtagaatctatcggtattggtgagtctatgtactacggaccaggtgcgggtcaaaaaccaactgcaaca 
AGTGTTGTAGCTGATATTGTCCGTATCGTTCGTCGTTTGAATGATGGTACTATTGGCAAAGACTTCAACGAAT^^ 
GCCGTGACTTGGTCTTGGCAAATCCTGAAGATGTCAAAGCAAACTACTATTTCTCAATCnTGGCTCTAGACT^ 
AGGTCAGGTCTTGAAGTTGGCTGAAATCTTCAATGCTCAAGATATTTCCTTTAAGCAAATCCTTCAAGATGGC 
GAGGGTGACAAGGCGCGTGTCGTTATCATCACACACAAGATTAATAAAGCCCAGCITGAAAATGTCTCAGCTGAA 
TTGAAGAAGGTTTCAGAATTCGACCTCTTGAATACCTTCAAGGTGCTAGGAGAATAA 

PGFGTVASGVPFLLKENGGKINQSAHSDIKVAKVLVKDEDEKNRLLAAGNDFNFVTNVDDILSDQDITIVVELMGRIEP 
AKTFITRALEAGKHVVTANKDLLAVHGAELLEIAQANKVALYYEAAVAGGIPILRTLANSLASDKITRVLGVVNGTSNF 
MVTKMVEEGWSYDDALAEAQRLGFAESDPTNDVDGIDAAYKMVILSQFAFGMKIAFDDVAHKGIRNITPEDVAVAQE 
LGYVVKLVGSIEETSSGIAAEVTPTFLPKAHPLASVNGVMNAVFVESIGIGESMYYGPGAGQKPTATSVVADIVRIVRRL 
NDGTIGKDFNEYSRDLVLANPEDVKANYYFSILALDSKGQVLKLAEIFNAQDISFKQILQDGKEGDKARVVIITHKINKA 
QLENVSAELKKVSEFDLLNTFKVLGEZ 

ID16 1725 bp 


ATGAAACACCTATTATCTTACTTCAAACCCTACATCAAGGAATCAATTTTAGCCCCOT 
CTGTTTTTGAGCTCTTGGTTCCCATGGTGATrGCTGGGATTGTTGACCAATCm 
TCTCTGGATGCAGATTGGCCTGCTCCTTATCTTTGCAGTAATTGGCGTTTTAGTGGCOT 
CAGCAAAGGCAGCAGTAGGTTCTGCrAAGGAATTGACAAACGATCTTTATCGTCATATTCTTTCCTTGCCC^ 
20 CAGCAGAGACCGTCTG ACAACIT Cn'AGTTTGGTCACTCGCTTGACTrCGGATACCTACCAGATTCAGACTGG 
AATCAATTCCTGCGTCTCrrmACGAGCGCCCATTATCGTTmGGTGCCATTTTTATGGC^ 
TGAG TTGACTTTCTGGTTCTTAGTCTTGGTTGCCATTTTGACCATTGTCATTGTAGGGTTAT 

CTTTCTACA GTAGT CTCAGAAAGAAAACGGACCAACTGGTTCAGGAAACGCGCCAGCAATTGCAAGGGATGCGGG 
TTATTCGTGCnTTTGGTCAAGAAAAACGAGAGTTACAGATTTTTCAAACCCTTAACCAAG^ 

25 AGAAAAGACAGGTTTCTGGTCTAGTTTATTAACACCTCTGACCTATCTGATTGTCAATGGAACTOT 

ATCTGG CAAG GCTATATTTCAATTCAAGGAGGAGTGCTCAGTCAAGGTGCTCTCATTGCTCTTATCAATTA 
TACAGATTTTGG TGGA ATTGGTCAAGCTAGCCATGTTGATCAATTCCCTCAACCAGTCCTATATCTCAGTCAAGCG 
AA TCGA GGAAGTCTTTGTTGAGGCTCCAGAGGATATCCATTCAGAGTTAGAACAAAAGCAAGCTACCAGAGATAA 
GGTTTTACAAGTCCAAGAATTGACCTTTACCTATCCTGATGCGGCCCAGCCITCTCTGAGATACAT^ 

30 ATGACrCAAGGACAAATTCTAGGTATCATCGGGGGAACTGGTTCTGGTAAATCAAGCITGGTGCAACT 

GACTTTATCCAGTAGACAAGGGGAACATTGACCTTTATCAAAATGGACGTAGTCCTCTTAATTTGGAGCAGTC 
GGTCTTGGATTGCCTATGTACCTCAAAAGGTCGAACTCTITAAAGGAACCATTCGTTCCAACTTGACTCTAGGm 
CAATCAAGAAGTATCTGACCAGGAACTCTGGCAGGCCTTGGAGATTGCGCAAGCTAAGGATTTTGTCAGTGAA^ 
GGAAGGACTCTTGGATGCTCTAGTTGAGGCAGGGGGGCGAAATTTCTCAGGTGGACAAAAACAAAGATTGTCTAT 

CGCCCGAGCAGTCTTGCGCCAGGCTCCGTTTCTCATCCTAGATGATGCAACCTCGGCACTGGATACCATTACAGAG 
TCCAAGCTCTTGAAAGCTATTAGAGAAAATTTTCCAAACACGAGCTTAATTTTGATCTCTCAACGi^ 
TACAGATGGCGGACCAGATTCTCCTCrrGGAAAAAGGTGAGTTGCTAGCTGTTGGCAAGCACGATGACTTGATGA 
AATCCAGCCAAGTCTATTGTGAAATCAATGCATCCCAACATGGAAAGGAGGACTAG 

MKHLLSYFKPYIKESILAPLFKLLEAVFELLVPMVIAGIVDQSLPQGDQGHLWMQIGLLLIFAVIGVLVALIAQFYSAKA 
AVGSAKELTNDLYRHILSLPKDSRDRLTTSSLVTRLTSDTYQIQTGINQFLRLFLRAPIIVFGAIFMAYRISAELTFWFLVL 
VAILTIVIVGLSRLVNPFYSSLRKKTDQLVQETRQQLQGMRVIRAFGQEKRELQIFQTLNQVYARLQEKTGFWSSLLTPL 
TYLIVNGTLLVIIWQGYISIQGGVLSQGALIALINYLLQILVELVKLAMLINSLNQSYISVKRIEEVFVEAPEDIHSELEQKQ 
ATRDKVLQVQELTFTYPDAAQPSLRYISFDMTQGQILGnGGTGSGKSSLVQLLLGLYPVDKGNIDLYQNGRSPLNLEQ 

WRSWIAYVPQKVELFKGTIRSNLTLGFNQEVSDQELWQALEIAQAKDFVSEKEGLLDALVEAGGRNFSGGQKQRLSIA 
RAVLRQAPFULDDATSALDTITESKLUCAIRENFPNTSLILISQRTSTLQMADQILLLEKGELLAVGKHDDLMKSSQVYC 
EINASQHGKEDZ 

ID18 1224 bp 

ATGAAACGTTCTCTCGACTCAAGAGTCGATTACAGTTTGCTCTTGCCAGTATTTTTTCT^ 

GGCTATCTATATAGCCGTTAGTCATGATTATCCCAATAATATTCTGCCCATTTTAGGGCAGCAGGTCGCCTGGATT 
G CCTTG GGGCTTGTGATTGGTTTTGTGGTCATGCTCTTTAATACAGAATTTCT^ 
TATTTTAGGCTTGGGACTTATGATCTTGCCGATTGTATTTTATAATCCAAGCT^ 
AACTGGGTATCAATAAATGGAATTACCCTATTCCAACCGTCAGAATTTATGAAGATATCCTATATCCTCATGTTGG 
CTCGTGTCATTG TCCA ATTTACAAAGAAACATAAGGAATGGAGACGCACGGTTCCGCTGGACTTTTTGT^^ 
CTGGATGATTCTCTTTACCATTCC^ 

TAGCCATTTTCTCAGGAATCGTTTTATTATCAGGGGTTTCTTGGAAAATTATTATCCCAGTATT^ 
ACAGGAGTTGCTGGTTTCTTAGCTATCTTTATTAGCAAGGACGGACGAGCTTTTCTTC^ 
OU CCTACCAAATTAATCGGATTTTGGCTTGGCTCAATCCCTTTGAGTTTGCCCAAACAACGACT^ 

AGGGCAGATTGCC ATTGG GAGTGGTGGCTTATTTGGTCAGGGATTTAATGCTTCGAATCTGCTTATCC 
GAGTCAGATATGATTTTTACGGTTATTGCAGAAGATTTTGGCTTTATTGGCTCTGTCCrGGT^ 
CATGTTGATTTACCGTATGTTGAAGATTACTCITAAATCAAATAACCAGTTCTACACTTATATTTCCACAGGT^ 
^ T TATG ATGTTGCTCTTCCACATCTTTGAGAATATCGGTGCTGTGACTGGACTACTTCCTT^ 
05 CCTTTCATTTCGCAAGGGGGATCAGCTATTATCAGTAATCTGATTGGTGTTGGTTTGCTm 
GACTAATCTAGCTGAAGAAAAGAGCGGAAAAGTCCCATTCAAACGGAAAAAGGTTGTATTAAAACAAATTAAATA 
A 

MKRSLDSRVDYSLLLPVFFLLVIGVVAIYIAVSHDYPNNILPILGQQVAWIALGLVIGFVVMLFNTEFLWKVTPFLYILGL 
GLMILPIVFYNPSLVASTGAKNWVSINGITLFQPSEFMKISYILMLARVIVQFTKKHKEWRRTVPLDFLLIFWMILFTIPVL 
VLLALQSDLGTALVFVAIFSGIVLLSGVSWKIIIPVFVTAVTGVAGFLAIFISKDGRAFLHQIGMPTYQINRILAWLNPFEF 
AQTTTYQQAQGQIAIGSGGLFGQGFNASNLLIPVRESDMIFTVIAEDFGFIGSVLVIALYLMLIYRMLKITLKSNNQFYTY 
ISTGLIMMLLFHIFENIGAVTGLLPLTGIPLPFISQGGSAIISNLIGVGLLLSMSYQTNLAEEKSGKVPFKRKKVVLKQIKZ 

ID22 987 bp 

ATGGTGGCTAAGAAAAAAATCTTATTTTTTATGTGGTCTTTTrCTCTT'GGAGG 

CCATTGTTTCAAATCTGGATCCAGAAAAGTATGATATrGATATTCTTGAAATGGAGCACTTTGACAAGGGATATGA 
ATCTGTTCCAAAGCATGTAraCATTTTAAAATCCCITCAAGATTATCGCCAAACCAGATG^ 

IJ TGGAGAATGAGAATTTATTTTCCAAGACTGACTCGTCGTTTGCTTGTAAAAGATGATTATGATGTTGAAGm 

TTACCATTATGAATCCACCACTGTTGTTCTCTAAAAGAAGAGAAGTCAAGAAGATATCTTGGATTCATGGAAGTAT 
TGAAGAACTTCTTAAGGATAGCTCTAAAAGAGAATCACATAGAAGCCAGTTGGATGCTGCGAATACAATTGTAGG 
GATTTCAA AAAA GACCAGCAATTCTATCAAGGAAGTTTATCCAGATTATACTTCTAAATTACAGACAATCTA^ 
GGATATGATTTTCAGACTATTCTAGAAAAATCrCAAGAGAAGATCGATATCGAGATTGCrCCTCAA^ 

ZU CTATCGGACGGATTG AGGAA AATAAGGGTTCTGACCGTGTAGTGGAAGTGATACGATTATTACACCAAGAGGGAA 

aaaactatcatctctattttatcggggctggtgatatggaagaggaactgaaaaaacgagtcaaagagtatggga 
ttgaggactatgtacatttcotggttatcaaaaaaatccttatcag 
atgtctaaacaagaaggttttcctggagtgtatgtggaggccttgagtctgggactcccttttato 
ttggaggggctgaggaattatcccaagaaggacgatttggacaaatcattgagagcaatcaagaggcagctcag 
Id gcgattactaattacatgacttctgcctcaaactttgatgtcgatgaggctagccaattcattcaac^ 
ttacaaaacaaatcgaacaagtagaaaaactattagaggagtag 

mvakkkilffmwsfslgggaekii^tivsnldpekydidilemehfdkgyesvpkhvwucslqdyrqtrwlraflwrm 
Riyfprltrrllvkddydvevsftimnppllfskrrevkkiswihgsieellkdsskreshrsqldaantivgiskktsnsik 
evypdytsklqtiyngydfqtileksqekidieiapqsictigrieenkgsdrvvevirllhqegknyhlyfigagdmeeel 
kkrvkeygiedyvhflgyqknpyqylsqtkvllsmskqegfpgvyvealslglpfistdvggaeelsqegrfgqiiesno 
eaaqaitnymtsasnfdvdeasqfiqqftitkqieqveklleez 






ID23 1434 bp



atggaaactgcattaattagtgtgattgtgccagtctataatgtggcgcagtacctagaaaaatcgatagcttcca 
ttcagaagcagacctatcaaaatcrggaaattattcttgttgatgatggtgcaacagatgaaagtggtcgcttgtg 
tgattcaatcgctgaacaagatgacagggtgtcagtgcttcataaaaagaacgaaggattgtcgcaagcacgaaa 
tgatgggatgaagcaggctcacggggattatctgatttttattgactcagatgattatatccatccagaaatgat^ 
cagagcttatatgagcaattagttcaagaagatgcggatgtttcgagctgtggtgtcatgaatgtctatgctaato
atgaaagcccacagtcagccaatcaggatgactattttgtctgtgattctcaaacatttctaaaggaatacctc^^ 
aggtgaaaaaatacctgggacgatttgcaataagctaatcaagagacagattgcaactgccctatccittcctaa 

GGGGTTGATTTACGAAGATGCCTATTACCATTTTGATTTAATCAAGrrGGCCAAGAAGTATGTGGTTAATA 
CCCTATTATTACTATTTCCATAGAGGGGATAGTATTACGACCAAACCCTATGCAGAGAAGGATTTAGCCTATATTG
ATATCTACCAAAAGTTTTATAATGAAGTTGTGAAAAACTATCCTGACTTGAAAGAOT
CTATGCCCACn-CTTTATTCTGGATAAGATGTTGCTAGATGATCAGTATAA^ 
CATCGTTTTTTAAAAGGCCATGCCTTTGCTATTTCTAGGAATCCAATm 
TGGCCCTATTCATAAATATTTCCTTATATCGATTCTTATTACTGAAAAATATTGAAAAATCT^ 
G 


metalisvivpvynvaqyleksiasiqkqtyqnleiilvddgatdesgrlcdsiaeqddrvsvlhkkneglsqarndgm 

kqahgdylifidsddyihpemiqslyeqlvqedadvsscgvmnvyandespqsanqddyfvcdsqtflkeyligekipg 

ticnklikrqiatalsfpkgliyedayyhfdliklakkyvvntkpyyyyfhrgdsittkpyaekdlayidiyqkfynevv 

knypdlkevaffrlayahffildkmllddqykqfeaysqihrflkghafaisrnpifrkgrrisalalfinislyrflllk 
niekskklhz



ID24 735bp 

atgagaatcaaagagaaaaccaataatattaatggaggaataaaaaatgtaagtaagcattatggtcattcaatc
attctcaaagatataaattttgcacttaacaagggtgaaattgttggtctagcagggagaaatggagttggtaag 
agtacgttgatgaaaattcttgttcagaataatcaaccgacttcaggtaatattataagcagtgataatgttgggt
atttaatcgaagaaccaaaattatttttatctaaaacaggtttagagaatttaaaatat^ 

TGTTGACTACAATCAAGAAAGATTTAGATGTITGATCCAAGAGTTAGATTrGACTCAGTCTATTAATAAAAAAGTA 

aagacctattctttgggtacaaaacaaaaattagctttgcttctaactctcgttacggaacctgatata^^
TAGATGAACCGACTAATGGTTTAGATATTGAATCATCACAAATAGTTTTAGCGGTTCTAAAAAAATTAGCT^
TGAAAATGT GGGAA TTTTAATATCGAGTCATAAATTAGAAGACATTGAAGAAATTTGTGAGAGAGTTCri'i U 'Cl T G 
GAGAACGGGCTTTTGACAprCAAAAAGTAGGAAAAGATAGTCATAATTTOT 
CTACAGATAGAGACATTTTCATTACCAAACAAGAATTTTGGGATATTGTTTAG

MRIKEKTNNINGGIKNVSKHYGHSIILKDINFALNKGEIVGLAGRNGVGKSTLMKILVQNNQPTSGNnSSDNVGYLIEEP 

KLFLSKTGLENLKYLSNLYGVDYNQERFRCLIQELDLTQSINKKVKTYSLGTKQKLALLLTLVTEPDILILDEPTNGLDIE 

SSQIVLAVLKKLALHENVGILISSHKLEDIEEICERVLFLENGLLTFQKVGKDSHNFLFEIAFSSATDRDIFITKQEFWDIVZ

ID25 1704bp

ATGACTGAATTAGATAAACGTCACCGCAGTAGCATTTATGACAGCATGGTTAAATCACCTAACCGTGCTATGC^ 
GTGCGACTGGTATGACAGATAAGGACTTTGAAACATCGATTGTGGGAGTGATTTCGACTTGGGCGGAAAATACAC 
CATGTAACATTCACTTGCATGATTTCGGGAAACTGGCTAAAGAAGGTGTCAAATCTGCAGGCGCTTGGCCTGT^^ 

AGTTTGGAACCATTACCGTAGCGGACGGGATCGCn'ATGGGAACGCCTGGTATGCGTTTCTCTCTAACATCTCGTGA
CATCATCGCGGACTCCATCGAGGCGGCTATGAGTGGTCACAACGTGGATGCCTTCGTCGCTATCGGTGGCTGTGA 
CAAGAACATGCCTGGATCTATGATTGCTATTGCTAATATGGATATCCCAGCTATTTTCGCCTATGGTGGAA 
GCACCGGGAAATCTTGATGGTAAAGATATCGACrrGGTTTCTGTCTrrGAAGGTATCGGAAAATGGAACCA^^ 
GACATGACAGCTGAGGACGTGAAACGTCTTGAATGTAATGCCTGCCCTGGCCCTGGTGGTTGTGGTGGTATGTAT 

ACTGCTAATACCATGGCAACTGCTATCGAAGTTCTAGGGATGAGTTTGCCAGGGTCATCCTCTCACCCAGCTGAAT
CAGCTGATAAGAAAGAAGATATCGAAGCAGCAGGACGTGCTGTTGTTAAGATGTTGGAACTTGGTCTCAAACCAT 
CAGATATCTTGACTCGTGAAGCCTTTGAAGATGCTATCACTGTAACGATGGCTCTCGGTGGTTCTACAAACGCCAC 
TCTTCACTTGCTCGCCATTGCCCATGCCGCAAATGTTGACTTGTCACTTGAGGACITCAATACG 
GTGCCTCACTTGGCCGACTTGAAACCATCTGGTCAGTATGTCTTCCAAGACCTCTACGAAGTCGGTGGTGTCCCT^ 

CGGTTATGAAGTiVnTGTTGGCAAATGGTTTCCTTCACG

AAACrrGGCTGACTTTGCAGACTTGACrCCAGGCCAAAAAGTTATCATGCCACTTGAA^ 
TGGTCCGCTTATCATCTTGAACGGGAACCITGCTCCrGACGGTGCAGTrGCCAAGGTATCAGGTGTT 
CGTCACGTTGGGCCAGCTAAGGTCTTTGACTCAGAAGAAGATGCGATTCAGGCCGTTCTGACAGATGAAATCGTT
GATGGCGATGTAGTCGTTGTTCGTTTTGTTGGACCTAAAGGTGGTCCTGGTATGCCTGAGATGCTATCACm 

AATGATTGTTGGTAAAGGTCAGGGAGATAAGGTGGCCCTCTTGACGGACGGACGTTTCTCTGGTGGTACrTATGGT
CTGGTTGTTGGACATATCGCTCCTGAAGCTCAGGATGGTGGACCAATTGCCTATCTCCGTACCGGCGATATCGTTA 
CGGTTGACCAAGATACCAAAGAAATTTCTATGGCCGTATCCGAAGAAGAACTTGAAAAACGCAAGGCAGAAACA 
ACCTTGCCACCACTTTACAGCCGTGGTGTCCTCGGTAAATATGCCCACATCGTATCATCrGCrrCACGCGGAGC^^ 
TGACAGACTTCTGGAATATGGACAAGTCAGGTAAAAAATAA 


MTELDKRHRSSIYDSMVKSPNRAMLRATGMTDKDFETSIVGVISTWAENTPCNIHLHDFGKLAKEGVKSAGAWPVQFG 
TITVADGIAMGTPGMRFSLTSRDIIADSIEAAMSGHNVDAFVAIGGCDKNMPGSMIAIANMDIPAIFAYGGTIAPGNLDG 
KDIDLVSVFEGIGKWNHGDMTAEDVKRLECNACPGPGGCGGMYTANTMATAIEVLGMSLPGSSSHPAESADKKEDIE 
AAGRAVVKMLELGLKPSDILTREAFEDAITVTMALGGSTNATLHLLAIAHAANVDLSLEDFNTIQERVPHLADLKPSGQ 
YVFQDLYEVGGVPAVMKYLLANGFLHGDRITCTGKTVAENLADFADLTPGQKVIMPLENPKRADGPLIILNGNLAPDG
AVAKVSGVKVRRHVGPAKVFDSEEDAIQAVLTDEIVDGDVWVRFVGPKGGPGMPEMLSLSSMIVGKGQGDKVALLT 
DGRFSGGTYGLVVGHIAPEAQDGGPIAYLRTGDIVTVDQDTKEISMAVSEEELEKRKAETTLPPLYSRGVLGKYAHIVSS 
ASRGAVTDFWNMDKSGKKZ 

ID26 274bp

ATGTTATAATAAAAATAAAGAATTTAAGGAGAAATACAATATGTCAATTTTTATTGGAGGAGCATGGCCATATGC 
AAACGGTTCGTTACATATTGGTCACGCGGCAGCGCTTTTACCGG^ 

GGGAGAGGAAGTTTTATATGTTTCTGGAAGTGATTGTAATGGAACCCCTATTTCTATCAGAGCTAAAA^ 
TAAGTCTGTGAAAGAAATTGCTGArrTTTATCATAAGGAATTTAATCCA

CYNKNKEFKEKYNMSIFIGGAWPYANGSLHIGHAAALLPGDILARYYRQKGEEVLYVSGSDCNGTPISIRAKKENKSVK 
EIADFYHKEFNP 

ID28 m^p

ATGACAACATTATTTTCAAAAATTAAAGAAGTAACAGAACTTGCTGCAGTCTCAGGTCATGAAGCGCCTGTCCGT 

GCTTATCTTCGTGAAAAGTTGACACCGCATGTGGATGAAGTGGTGACAGATGGCTTGGGTGGTATTm 

AACATTCAGAAGCTGTGGATGCACCGCGCGTCTTGGTCGCTTCTCATATGGACGAAGTTGGTTTTATGGTC 

AATCAAGCCAGATGGTACCTTCCGTGTCGTAGAAATCGGTGGCTGGAACCCCATGGTGGTTAGCAGCCAACGTTT
CAAACTCTTGACTCGTGATGGTCATGAAATTCCTGTGATTTCAGGTTCTGTTCCTCCGCATTT^ 
GGGGGACCAACCATGCCAGCCATTGCCGATATCGTTTTTGATGGTGGTTTTGCGGACAAG 
TTTGGCATCCGTCCTGGTGATACCATTGTACCAGATAGTTCTGCAATITTGACAGCCAATGAAAAAAATATC^^ 
CAAAAGCTTGGGATAACCGCTACGGTGTCCTCATGGTAAGCGAGCTAGCTGAAGCTTTATCGGGTCAAAAACTCG 

GCAATGAACTCTATCTGGGTTCTAACGTCCAAGAAGAAGTTGGTCTGCGTGGCGCTCATACCTCTACAACCAAGTT
TGACCCAGAAGTCTTCCTCGCAGTTGATTGCrCACCAGCAGGTGATGTCTACGGTGGTCAAGGCAAGATTGGAGA
TGGAACCrrGATTCGTTTCTATGATCCAGGTCACTTGCrrCTCCCAGGGATGAAGGATTTCCT^ 
GAAGAAGCTGGTATCAAGTACCAATACTACTGTGGTAAAGGCGGAACAGATGCAGGTGCAGCTCATCTGAAAAAT 
GGTGGTGTCCCATCAACAACTATCGGTGTCTGCGCTCGTTATATCCATTCTCACCAAACCCTCTATGCAATGGATG 
ACTTCCTAGAAGCGCAAGCTTTCTTACAAGCCTTGGTGAAGAAATTGGATCGTTCAACGGTT^
TTATTAA 

MTTLFSKIKEVTELAAVSGHEAPVRAYLREKLTPHVDEVVTDGLGGIFGIKHSEAVDAPRVLVASHMDEVGFMVSEIKP 
DGTFRVVEIGGWNPMVVSSQRFKLLTRDGHEIPVISGSVPPHLTRGKGGPTMPAIADIVFDGGFADKAEAESFGIRPGDT 
IVPDSSAILTANEKNIISKAWDNRYGVLMVSELAEALSGQKLGNELYLGSNVQEEVGLRGAHTSTTKFDPEVFLAVDCS
PAGDVYGGQGKIGDGTLIRFYDPGHLLLPGMKDFLLTTAEEAGIKYQYYCGKGGTDAGAAHLKNGGVPSTTIGVCARY 
IHSHQTLYAMDDFLEAQAFLQALVKKLDRSTVDLIKHYZ 






W3X n82bp 



ATGGAATTTTCTATGAAATCAGTCAAAGGACTACTCTTTATCATAGCTAGTTTTATC^ 
GAACACTTCTCCCCAATTCATGATTCCAGGACTAGCTTTAACAAGCCTATCTCTGACT^ 
CTCCCACTACTAGAAAGCTGGTTTCACAGTTTGGAGAAGGTCTACACCGTCCACAAATTCACAG 
TCATCCTACTAATCTTTCATAACTTTAGTATGGGCGGTTTGTGGGGCrCTCGCTTAGCT^
GCCATCTATATCTTTGCCAGCATCATCCTTGTCGCCTATTTAGGCAAATACATCCAATACGAAGCTTGGC^
TTCACCGCCTGGTTTACCTAGCCTATATTTTAGGACTCTTTCACATCTACATGATAATGGGCAATC^
TTTAATCTTCTAAGTTTTCTTGTTGGTAGCTATGCCCTTTTAGGCT^ 

CAAAAGATTTCCTTCCCCTATCTAGGGAAAATTACCCATCTCAAACGCTTAAATCACGATACTAGAGAAAT^^
ATCCATCTTAGCAGACCTTTCAACTATCAATCAGGACAATTTG^
GTGCTCCGCATCCCITITCTATCTCAGGAGGTCATGGTCAAACTCTTTACTTTACT^

TACCAAGAATATCTATGATAATCTTCAAGCCGGCAGCAAAGTAACCCTAGACAGAGCTTACGGACACATGATCAT 

AGAAGAAGGACGAGAAAATCAGGTTTGGATTGCTGGAGGTATTGGGATCACCCCOT 

ACATCCTATTTTAGATAAACAGGTTCACTTCTACTATAGCTTCCGTGGAGATGAAAATGCAGTCTACCT 

ctccgtaactatgctcagaaaaatcctaattttgaactccatctaatcgacagtacgaaagacggctatot 
ttgaacaaaaagaagtgcccgaacatgcaaccgtctatatgtgtggtcctatttctatgatgaaggcacttgcc^
aacagattaagaaacaaaatccaaaaacagagcatatttac 

mefsmksvkgllfiiasfiltlltwm^r^spqfmipglaltslslt^latrlplleswfhslekvytvhkf^afl^^ 
nfsmgglwgsrlaaqfgnlaiyifasiilvaylgkyiqyeawrwihrlvylayilglfhiymimgnrlltfnllsflvgs 
yallgllagfyiiflyqkisfpylgkithlkrlnhdtreiqihlsrpfnyqsgqfaflkifqegfesaphpfsisgghgqtly
ftvktsgdhtkniydnlqagskvtldrayghmiieegrenqvwiaggigitpfisyirehpildkqvhfyysfrgdenav 
yldllrnyaqknpnfelhudstkdgylnfeqkevpehatvymcgpismmkalakquckqnpktehiy 



ID32 900bp



ATGACTTTTAAATCAGGCTTTGTAGCCATTTTAGGACGTCCCAATGTTGGGAAGTCAACCT^^ 
TGGGGCAAAAGATTGCCATCATGAGTGACAAGGCGCAGACAACGCGCAATAAAATCATGGGAATTTACACGACTG 
ATAAGGAGCAAATTGTCTTTATCGACACACCAGGGATTCACAAGCCTAAAACAGCTCTCGGAGATTTCATGGrr^ 
AGTCTGCCTACAGTACCCTTCGCGAAGTGGACACTGTTCTTTTCATGGTGCCTGCTGATGAAGCGCGTGGTAAGGG 

GGACGATATGATTATCGAGCGTCTCAAGGCTGCCAAGGTTCCTGTGATTTTGGTGGTGAATAAAATCGATAAGGTC
CATCCAGACCAGCTCTTGTCTCAGATTGATGACTTCCGTAATCAAATGGACTTTAAGGAAATTG^ 
CCCTTCAGGGAAATAACGTGTCTCGTCTAGTGGATATTTTGAGTGAAAATCTGGATGAAGGTTTCCAATAT^ 
GTCTGATCAAATCACAGACCATCCAGAACGTTTCTTGGTTTCAGAAATGGTTCGCGAGAAAGTCTTGCACCTA^ 
CGTGAAGAGATTCCGCATTCTGTAGCAGTAGTTGTTGACTCTATGAAACGAGACGAAGAGACAGACAAGGTTCAC 

ATCCGTGCAACCATCATGGTCGAGCGCGATAGCCAAAAAGGGATTATCATCGGTAAAGGTGGCGCTATGCTTAAG
AAAATCGGTAGCATGGCCCGTCGTGATATCGAACTCATGCTAGGAGACAAGGTCTTCCTAGAAACCTGGGTCAAG 
GTCAAGAAAAACrGGCGCGATAAAAAGCTAGATTTGGCTGACTTTGGCTATAATGAAAGAGAATACTAA 

MTFKSGFVAILGRPNVGKSTFLNHVMGQKIAIMSDKAQTTRNKIMGIYTTDKEQIVFIDTPGIHKPKTALGDFMVESAYS 
TLREVDTVLFMVPADEARGKGDDMIIERLKAAKVPVILVVNKIDKVHPDQLLSQIDDFRNQMDFKEIVPISALQGNNVS
RLVDILSENLDEGFQYFPSDQITDHPERFLVSEMVREKVLHLTREEIPHSVAVVVDSMKRDEETDKVHIRATIMVERDSQ 
KGIIIGKGGAMLKKIGSMARRDIELMLGDKVFLETWVKVKKNWRDKKLDLADFGYNEREYZ 



ID33 855bp



CTGCTTCTTGTTTTTACAGAAGGAGGACITATGCCTGAATTACCTGAGGTO 

AATTGATTATAGGAAAGAAGATTTCGAGTATAGAAATTCGCTACCCCAAGATGATTAAGACGGATTTGGAAGAGT 
TTCAAAGGGAATTGCCTAGTCAGATTATCGAGTCAATGGGACGTCGTGGAAAATATTTGCTTTm 
CAAGGTCTTGATTTCCCATTTGCGGATGGAGGGCAAGTATTTTTACTA 
GCCCATGTTTTCTTTCATTTTGAAGATGGTGGCACGCTTGTTTATGAGGATG^
TCTTGGTGCCTGACCTTTTAGACGTCrACTT^

TTTACAGGTCTTTCAATCTGCCCTTGCCAAGTCCAAAAAGCCTATCAAATCCCATCrCCTAGACCAGA^ 

GCTGGACTTGGCAATATCTATGTGGATGAGGTTCTCTGGCGAGCTCAGGTTCATCCAGCTAGACCTTCCCAGAm 

TGACAGCAGAAGAAGCGACTGCCATTCATGACCAGACCATTGCTGTTTTGGGCCAGGCTGTTGAAAAAGGTGGCT 

CCACCATTCGGACTTATACCAATGCCmCGGGAAGATGGAAGC^TGCAGGACTTTCATCAGGT^ 

CTGGTCAAGAATGTGTACGCTGTGGTACCATCATTGAGAAAATTCAACTAGGCGGACGTGGAACCCACTTTTGTCC 

AAACTGTCAAAGGAGGGACTGA 

MLLVFTEGGLMPELPEVETVCRGLEKLIIGKKISSIEIRYPKMIKTDLEEFQRELPSQIIESMGRRGKYLLFYLTDKVLISHL 
RMEGKYFYYPDQGPERKHAHVFFHFEDGGTLVYEDVRKFGTMELLVPDLLDVYFISKKLGPEPSEQDFDLQVFQSALA 
KSKKPIKSHLLDQTLVAGLGNIYVDEVLWRAQVHPARPSQTLTAEEATAIHDQTIAVLGQAVEKGGSTIRTYTNAFGED 
GSMQDFHQVYDKTGQECVRCGTIIEKIQLGGRGTHFCPNCQRRDZ 

ID34 633bp



TTGTCCAAACTGTCAAAGGAGGGACTGATGGGAAAAATCATCGGAATCACTGGGGGAATTGCCTCTGGTAAGTCA
ACTGTGACAAATITTCTAAGACAGCAAGGCTTTCAAGTAGTGGATGCCGACGCAGTCGTCCACCAACTACAGAAA 
CCTGGTGGTCGTCrGTTTGAGGCTCT^\GTACAGCAC^ 

GCCCTCTCCTAGCTAGTCrCATCTrrrCAAATCCTGATGAACGAGAATGGTCTAAGCAAATTCAAGG 
CCGTGAGGAACTGGCTACTTTGAGAGAACAGTTGGCTCAGACAGAAGAGATTTTCITCATGGAT^

TTTGAGCAGGACTACAGCGATTGGTTTGCTGAGAOTGGTTGGTCTATGTGGACCGAGATGCCCAAGTGGAACGC 
TTAATGAAAAGGGACCAGTTGTCCAAAGATGAAGCTGAGTCTCGTCTGGCAGCCCAGTGGCCTTTAGAAAAAAAG 
AAAGATTTGGCCAGCCAGGTTCTTGATAATAATGGCAATCAGAACCAGCTTCTTAATCAAGTGCATATC(^ 
AGGGAGGTAGGCAAGATGACAGAGATTAA 


MSKLSKEGLMGKnGITGGlASGKSTVTNFLRQQGFQVVDADAVVHQLQKPGGRLFEALVQHFGQEIILENGELNRPLL 

ASLIFSNPDEREWSKQIQGEIIREELATLREQLAQTEEIFFMDIPLLFEQDYSDWFAETWLVYVDRDAQVERLMKRDQLS 

KDEAESRLAAQWPLEKKKDLASQVLDNNGNQNQLLNQVHILLEGGRQDDRDZ 

30 iD?g 

TTGATAATAATGGCAATCAGAACCAGCTTCTTAATCAAGTGCATATCCTTCTTGAGGGAGGTAGGCAAGATGACA 
GAGATTAACTGGAAGGATAATCTGCGCATTGCCTGGTTTGGTAATTTTCTGACAGGAGCCAGTATTTC^
TACCTTTTATGCCCATCTTCGTGGAAAATCTAGGTGTAGGGAGTCAGCAAGTCGCTTT^ 
TTCTGTCTCTGCTATTTCCGCGGCGCTCTTTTCTCCTATTTGGGGTATTOT
TG^O-GATTCGGGCAGGTCTTGCTATGACTATCACTATG 

CTTTmCGTTTACTAAACGGTGTATTTGCAGGTTTTGTTCCTAATGCAACGGCACT^ 
AAGGAGAAATCAGGCTCTGCCTTAGGTACTTTGTCTACAGGCGTAGTTGCAGGTACTCTAACTGGTCCC^^ 
GTGGCnTATCGCAGAATTATTTGGCATTCGTACAGTTTTCnTACTGGTTGCT^
TGACTATTTGCTTTATCAAGGAAGATTTTCAACCAGTAGCCAAGGAA^
CTCGGTTAAATATCCCTATCTTTTGCTCAATCTCITTTTAACCAGTT^
GCCCTATTTTGGCTCTTTATGTACGCGACTTAGGGCAGACAGAGAATC^ 

AGTATGGGCiiiiCCAGCATGATGAGTGCAGGAGTCATGGGCAAGCTAGGTGACAAGGTGGGCAATCATCGTCTC 
TTGGTTGTCGCCCAGTTTTATTCAGTCATCATCTATCTCCTCTGTGCCAATGCCTCTAGCC^ 
CTATCGTTTCCTCTTTGGATTGGGAACCGGTGCCTTGATTCCCGGGGTTAATGCCCTACTCAGCAAAAT^
AAAGCCGGCATTTCGAGGGTCTTTGCCrrCAATCAGGTATTCTTTTATCTGGGAGGTC 
GTTCTGCAGTAGCAGGTCAATTTGGCTACCATGCTGTCTTTTATGCGACAAGCCTTTGTG^^ 
TTTAACCTGATTCAATTTCGAACATTATTAAAAGTAAAGGAAATCTAG 

MIIMAIRTSFLIKCISFLREVGKMTEINWKDNLRIAWFGNFLTGASISLVVPFMPIFVENLGVGSQQVAFYAGLAISVSAIS
AALFSPIWGILADKYGRKPMMIRAGLAMTITMGGLAFVPNIYWLIFLRLLNGVFAGFVPNATALIASQVPKEKSGSALG 
TLSTGVVAGTLTGPFIGGFIAELFGIRTVFLLVGSFLFLAAILTICFIKEDFQPVAKEKAIPTKELFTSVKYPYLLLNLFLTS 
FVIQFSAQSIGPILALYVRDLGQTENLLFVSGLIVSSMGFSSMMSAGVMGKLGDKVGNHRLLVVAQFYSVIIYLLCANAS 
SPLQLGLYRFLFGLGTGALIPGVNALLSKMTPKAGISRVFAFNQVFFYLGGVVGPMAGSAVAGQFGYHAVFYATSLCV 

AFSCLFNLIQFRTLLKVKEIZ
ID36 nubp 

ATGGCCCTACCAACTATTGCCATTGTAGGACGTCCCAATGTTGGGAAATCAACCCTATTTAATCGGATCGCT^ 
AGCGAATCTCCATTGTAGAAGATGTCGAAGGAGTGACACGTGACCGTATTTATGCAACGGGTGAGTGGCTCAATC

GTTCTTTTAGCATGATTGATACAGGAGGAATTGATGATGTCGATGCTCCTTTCATGGAACAA^^

AGAAATTGCCATGGAAGAAGCAGATGTTATCGTTTTTGTCGTGTCTGGTAAGGAAGGAATTACTGATGCAGACGA 
ATACGTAGCTCGTAAGCTTTATAAGACCCACAAACCAGTTATCCTCGCAGTCAACAAGGTGGACAACCCTGAGAT 
GAGAAATGATATATATGATTTCTATGCTCTCGGTTTGGGTGAACCATTGCCTATCTCATCTGTC 
ACAGGGGATGTGCTAGATGCGATCGTAGAAAATCTTCCAAATGAATATGAGGAAGAAAATCCAGATGTCATTAAG 

TTTAGCTTGATTGGTCGTCCTAACGTTGGAAAATCAAGCTTGATCAATGCTATCTTGGGAGAAGACCGTGTTA^^
TTGCCTGTOTATGGCTGTGCrCATGGATAGTTTGACTTGGCTCAATGACCTGATTTACCCT^
CAGACCATTCCGACCATTGCCATAGCTCCTATCCTGGTCTTGTGGCTAGGTTATGGGATm 
TGATTATCTTAACGACAACCTTTCCCATCATCGTTAGTATriTGGACGGTTTTAGGC^ 
GACCTTGTTTAGTCTGATGCGGGCCAAGCCrrGGCAAATCCTGTGGCArmAAAATCCCAGTTAGCCTGC
TTTTATGCAGGTCTGAGGGTCAGTGTCTCCTACGCCTTTATCACAACTGTGGTATCTGAGTGGT^^
AAGGTCTTGGTGTTTATATGATTCAGTCTAAAAAACTGTTTCAGTATGATACCATGTTTGCCAT^^
TCGATTATCAGTCTTTTGGGTATGAAGCTGGTCGATATCAGTGAAAAATATGTGATTAAATGGAAA 

MMRNLRSILRRHISLLGFLGVLSIWQLAGFLKLLPKFILPTPLEILQPFVRDREFLWHHSWATLRVALLGLILGVLIACU^ 
AVLMDSLTWLNDLIYPMMVVIQTIPTIAIAPILVLWLGYGILPKIVLIILTTTFPIIVSILDGFRHCDKDMLTLFSIJVIRAKP
WQILWHFKIPVSLPYFYAGLRVSVSYAFITTVVSEWLGGFEGLGVYMIQSKKLFQYDTMFAIIILVSIISLLGMKLVDISE 
KYVIKWKRSZ 






ID42 372bp



TTGATTTTTAATCCTATTTGCTGTATGATAAGGGAAAAGAAAGGGGACAGAGATATGGCTT^
TGCGATCTGCTAGTnTGGTATTGTTACCAGCTTGCCTG^^ 

TTCTTAAAAAATGTCTTTGAATTGGAAGAAGAACTCGAGTTTCAATTGCTTAATAACC
ACTTTTCAAGTCAACACCTCCCTACAGCCATrGATTTTGAC^ 
GTACTGGTTTTAGACATGGACGGTAGAGAAACTATCCTCCTCCCAGAAGAAAATGACCTATTTTAA

MIFNPICCMIREIOCGDRDMAFr>rrHMRSASFGnrrSLPDDIIDSFWYIIDHFLKNVFELEEELEFQLLNNQGKrrF^ 
HLPTAIDFDFNHPFDPRYPPRVLVLDMDGRETILLPEENDLFZ 

ID43 1569bp

ACAGCGGTGTCATTCTATCTATmAAGAAAAGTAATAATCAATTGTTAAAAATAGTAAAAAAATTG

ATGAAATATTTTGrrCCTAATGAGGTATTCAGTATTCGTAAATTAAAGGTGGGGACTTGCTCGGTACTATTGGCAA

TTTCAATTTTGGGAAGCCAAGGTATTTTATCGGATGAAGrTGTTACrAGTTCTTCACCGATGGCTAC 

TTCTAATGCAATTACTAATGATTTAGATAATTCACCAACTGTTAATCAGAATCGTTCTGCTGAAATGAT^^

AATTCAACCACTAATGGTTTAGATAATTCGTTAAGTGTTAATAGCATCAGCTCTAATGGTACTATTCGTTCCAAT^ 
CACAATTAGACAACAGAACAGTTGAATCTACAGTAACATCTACTAATGAAAATAAGAGTTATAAGGAAGATGTTA 
TAAGTGACAGAATTATCAAAAAAGAATTTGAAGATACTGCTTTAAGTGTAAAAGATTATGGTGCAGTAGGTGATG 
GGATTCATGATGATCGACAAGCAATTCAAGATGCAATAGATGCTGCAGCTCAAGGGCTAGGTGGAGGAAATGTAT

ATTTTCCTGAAGGAACTTATTTAGTAAAAGAAATTGTTTTTTTAAAAAGTC^^

AGCrACAATTCTAAATGGTATAAATATTAAGAATCACCCTTCCATTGTTTTTATGACAGGT^ 
GGTGCGCAAGTAGAATGGGGCCCAACAGAAGATATTAGTTATTCTGGTGGTACGATTGATATGAACGGTGCTTTG 
AATGAAGAAGGAACTAAAGCAAAAAATCTACCACTTATAAATTCTTCAGGTGCATTTGCTATTGGGAAT^ 
AACGTAACTATAAAAAATGTAACATTCAAGGATAGTTATCAAGGGCATGCTATTCAAATTGCAGGTTCGAAAAAT

GTATTAGTTGATAATTCTCGTTTTCTTGGGCAAGCCTTACCCAAAACGATGAAGGATGGGCAAATCATAAG

AGAGCATTCAGArTGAACCATTAACTAGAAAAGGTmCCTTATGCCTTGAATGATGATGGGAAAAAATCTGAAA
ATGTGACTATTCAAAATTCCTATTTTGGCAAAAGTGATAAATCTGGGGAATTAGTAACAGCAATTGGCACACACT 
TCAAACATTGTCGACACAGAACCCCTCTAATATTAAAATTCAAAATAATCATTTTGATAACATGATGTATGCAGGT
GTACGTTTTACAGGATTCACTGATGTATTAATCAAAGGAAATCGCTTTGATAAGAAAGTTAAAGGAGAGAGTGTA 

CATTATCGAGAAAGCGGAGCAGCTTTAGTAAATGCTTATAGCTATAAAAACACTAAAGACCTATTAGATT^

AAACAGGTGGTTATCGCCGAAAATATATTTAATATTGCCGATCCTAAAACAAAAGCGATACGAGTTGCAAAAGAT 
AGTGCAGAATGTTTAGGAAAAGTATCAGATATTACTGTAACAAAAAATGTAATTAATAATAATTCTAAGGAAACA 
GAACAACCAAATATTGAATTATTACGAGTTAGTGATAATTTAGTAGTCTCAGAGAATAGT 

QRCHSIYFKKSNNQLUCIVKKLEVLMKYFVPNEVFSIRKLKVGTCSVLLAISILGSQGILSDEVVTSSSPMATKESSNAITN
DLDNSPTVNQNRSAEMIASNSTTNGLDNSLSVNSISSNGTIRSNSQLDNRTVESTVTSTNENKSYKEDVISDRIIKKEFEDT 
ALSVKDYGAVGDGIHDDRQAIQDAIDAAAQGLGGGNVYFPEGTYLVKEIVFLKSHTHLELNEKATILNGINIKNHPSIVF 
MTGLFTDDGAQVEWGPTEDISYSGGTIDMNGALNEEGTKAKNLPLINSSGAFAIGNSNNVTIKNVTFKDSYQGHAIQIA 
GSKNVLVDNSRFLGQALPKTMKDGQIISKESIQIEPLTRKGFPYALNDDGKKSENVTIQNSYFGKSDKSGELVTAIGTHY 
QTLSTQNPSNIKIQNNHFDNMMYAGVRFTGFTDVLIKGNRFDKKVKGESVHYRESGAALVNAYSYKNTKDLLDLNKQ 
VVIAENIFNIADPKTKAIRVAKDSAECLGKVSDITVTKNVINNNSKETEQPNIELLRVSDNLVVSENS 

ID44 324bp

GTGATGAAAGAAACTCAGCTATTAAAAGGTGTTCTTGAAGGTTGTGTCTTGGATATGATTGGTCAAAAAGAGCGG
TATGGTTATGAGTTGGTTCAGACTTTGCGAGAGGCTGGATTTGATACTATCGTTCCAGGAACTAm 
GCAAAAGTTAGAAAAAAATCAATGGATAAGAGGCGACATGCGCCCGTCGCCAGATGGTCCAGATCGGAAGTATTT 
TTCATTAATGAAAGAAGGAGAAGAGCGTGTCTCAGTCTTTTGGCAACAATGGGACGATTTGAGTCAAAAAGTAGA 
AGGGATTAAGAATGGGGGTTAA 

CTAGTCCTGTTGCTGGAACAACTCGTGATGCCATTGATACCCACTTTACAGATACAGATGGTCAAGAGm
GATTGATACGGCTGGTATGCGTAAGTCTGGTAAGGTTTATGAAAATACTGAGAAATACTCTGTTATGCGTGCCATG 
CGTGCTATTGACCGTTCAGATGTGGTCTTGATGGTCATCAATGCGGAAGAAGGCATTCGTGAGTACGACAAGCGT 
ATCGCAGGATTTGCCCATGAAGCTGGTAAAGGGATGATTATCGTGGTCAACAAGTGGGATACGCTTGAAAAAGAT 
AACCACACTATGAAAAACTGGGAAGAAGATATCCGTGAGCAGTTCCAATACCTGCCTTACGCACCGATTATCTTT
GTATCAGCTITAACCAAGCAACGTCTCCACAAACTTCCTGAGATGATTAAGCAAATCAGCGAAAGTCAAAATAC^ 
CGTATTCCATCAGCTGTCTTGAACGATGTCATCATGGATGCCATTGCCATCAACCCAACACCGACAGACAAAGGA 
AAACGTCTCAAGATTTTCTATGCGACCCAAGTGGCAACCAAACCACCAACCTTTGTCATCTTTGT^ 

AACrCATGCACTTTTCTTACCTGCGTTTOTGGAAAATCAAATCCGCAAGGCCm

TCATCTCATCGCAAGAAAACGCAAATAA

MALPTIAIVGRPNVGKSTLFNRIAGERISIVEDVEGVTRDRIYATGEWLNRSFSMIDTGGIDDVDAPFMEQIKHQAEIAM 
EEADVrVFVVSGKEGlTDADEYVARKLYKTHKPVILAVNKVDNPEMRNDIYDFYALGLGEPLPISSVHGIGTGDVLDAI 
VENLPNEYEEEWDVIKFSLIGIU»IWGKSSLINAILGEDRVlASPVAGTnU>AIDTHFrDTDGQEFrMIDTAGM 
YEhTTEKYSVMRAMRAIDRSDVVLMVINAEEGIREYDKRIAGFAHEAGKGMnVVNKWDTLEKDNHmKNWEEDIREQ 
FQYLPYAPIIFVSALTKQRLHKLPEMIKQISESQNTRIPSAVLNDVIMDAIAINPTPTDKGKRLKIFYATQVATKPPTFVIFV 
NEEELMHFSYLRFLENQIRKAFVFEGTPIHLIARKRKZ 






ID37 714bp



ATGACAGAAACCATTAAATTGATGAAGGCTCATACTTCAGTGCGCAGGTTTAAAGAGCAAGAAATTCCCCAAGTA
GACTTAAATGAGATTTTGACAGCAGCCCAGATGGCATCATCrrGGAAGAATTTCCAATCCTACT 

TCTCITTGTCGGAGATTTGAACCGAGCAGAAAAGGGAGCCCGACTTCATACCGACACCTTCCAACCCC^ 
GGAAGGTCTCTTGATTAGTTCGGTCGATGCAGCTCTTGCTGGACAAAACGCCTTGTTGGCAGCT^

TATGGTGGTGTGATTATCGGTTTGGTTCGATACAAGTCTGAAGAAGTGGCAGAGCTCmAACCTACCTGACTACA 
CCTATTCTGTCTTTGGGATGGCACTGGGTGTGCCAAATCAACATCATGATATGAAACCGAGACTGCCACTAGAGA 
ATGTTGTCTTTGAGGAAGAATACCAAGAACAGTCAACTGAGGCAATCCAAGCTTATGACCGTGTTCAGGCTGACr 
ATGCTGGGGCGCGTGCGACCACAAGCTGGAGTCAGCGCCTAGCAGAACAGTTTGGTCAAGCTGAACCAAGCTCAA 
CTAGAAAAAATCTTGAACAGAAGAAATTATTGTAG

MTETIKLMKAHTSVRRFKEQEIPQVDLNEILTAAQMASSWKNFQSYSVrvVRSQEKKDALYELVPQEAIRQSAVFLLFV 

GDLNRAEKGARLHTDTFQPQGVEGLLISSVDAALAGQNALLAAESLGYGGVIIGLVRYKSEEVAELFNLPDYTYSVFG 

MALGVPNQHHDMKPRLPLENVVFEEEYQEQSTEAIQAYDRVQADYAGARATTSWSQRLAEQFGQAEPSSTRKNLEQK 
KLLZ

ID38 729bp

ATGACAGAAATTAGACTAGAGCACGTCAGTTATGCCTATGGTCAGGAGAGGATTTTAGAGGATATCAACCTAC^^ 
GTGACTTCAGGCGAAGTGGTTTCCATCCTAGGCCCAAGTGGTGTTGGAAAGACCACCCTCTTTAATCTAA^
GGATTTTAGAAGTTCAGTCAGGGAGAATTGTCCITGATGGTGAAGAAAATCCCAAGGGGCGCGTGAGTT^ 
TGCAAAAGGATCTGCTCTTGGAGCACAAGACGGTGCnTGGAAATATCATTCTGCCCCTCTTGATTCAAAAGG 
ATAAGGCAGAAGCTATTTCCCGAGCGGATAAAATTCTTGCGACCTTCCAGCTGACAGCTGTAAGAGACAAGTATC 
CTCATGAACTTAGCGGTGGGATGCGCCAGCGTGTAGCCTTACTCCGGACCTACCTTTT^ 
OTAGATGAGGCaTTAGCGCCTTGGATGAGATGACAAAGATGGAACTCCACGCTTGGTATCTTG 
GCAGTTGCAGCTAACAACCCTGATCATCACGCATAGTATTGAGGAGGCCCTCAATCTCAGCGACCGTATCTATATC 
TTGAAAAATCGCCCTGGGCAGATTGTTTCAGAAATTAAACTAGATTGGTCTGAAGATGAGGACAAGGAAGTCC^^ 
AAGATTGCCTACAAACGTCAAATTTTGGCGGAATTAGGCTTAGATAAGTAG 



MTEIRLEHVSYAYGQERILEDINLQVTSGEVVSILGPSGVGKTTLFNLL^GILEVQSGRIVLDGEENPKGRVSYMLQKDLL 
LEHKTVLGNIILPLLIQKVDKAEAISRADKILATFQLTAVRDKYPHELSGGMRQRVALLRTYLFGHKLFLLDEAFSALDE 
MTKMELHAWYLEIHKQLQLTTLIITHSIEEALNLSDRIYILKNRPGQIVSEIKLDWSEDEDKEVQKIAYKRQILAELGLDK 



ID39 243?bp



ATGAACTATTCAAAAGCATTGAATGAATGTATCGAAAGTGCCTACATGGTTGCTGGACATTTTGGAGCT^ 
TAGAGTCGTGGCACTTGTTGATTGCCATGTCTAATCACAGTTATAGTGTAGCAGGGGCAACrrrAAATGATTATCC 
GTATGAGATGGACCGTITAGAAGAGGTGGCTTTGGAACTGACTGAAACGGACTATAGCCAGGATGAAACCTT^ 
GGAATTGCCGTTCTCCCGTCGTTTGCAGGTTCTTTTTGATGAAGCAGAGTATGTAGCGTCAGTC 
GTACTAGGGACAGAGCACGTCCTCTATGCGATTTTGCATGAT^^ 
GCTGGi l r rii^ ri ATGAAGACAAGAAAGATCAGGTCAAGATTGCTGCTCTTCGTCGAAATTTAGAAGAACGGGCA 
GGCTGGACTCGTGAAGATCTCAAGGCTTTACGCCAACGCCATCGTACAGTAGCTGACAAGCAAAATTCTATGGCC 
AATATGATGGGCATGCCGCAGACTCCTAGTGGTGGTCTCGAGGATTATACGCATGATTTGACAGAGCAAGCGCGT 
TCTGGCAAGTTAGAACCAGTCATCGGTCGGGACAAGGAAATCTCACGTATGATTCAAATCTTGAGCCGGAAGACT
AAGAACAACCCTGTCTTGGrrGGGGATGCTGGTGTCGGGAAAACAGCTCTGGCGCTTGGTCTTGCCCAGCGTATO

CTAGTGGTGACGTGCCTGCGGAAATGGCTAAGATGCGCGTGTTAGAACTTGATTTGATGAATGTCGTTGCAGGGA 

CACGCTTCCGTGGTGACTTTGAAGAACGCATGAATAATATCATCAAGGATATTGAAGAAGATGGCCAAGTCATCC 

TCTITATCGATGAACTCCACACCATCATGGGTTCTGGTAGCGGGATTGATTCGACTCTGGATGCGGCCAAT^^ 

GAAACCAGCCTTGGCGCGTGGAACTTTGAGAACGGTTGGTGCCACTACTCAGGAAGAATATCAAAAACATATCG^ 

AAAAGATGCGGCACTTTCTCGTCGTTTCGCTAAAGTGACGATTGAAGAACCAAGTGTGGCAGATAGTATGACTAT 

TTTACAAGGTTTGAAGGCGACITATGAGAAACATCACCGTGTACAAATCACAGATGAAGCGGTTGAAACAGCGGT 

TAAGATGGCTCATCGTTATTrAACCAGTCGTCACTTGCCAGACTCTGCTATCGATCrC^ 

ACAGTGCAAAATAAGGCAAAGCATGTAAAAGCAGACGATTCAGATTTGAGTCCAGCTGACAAGGCCCTGATGGAT 

GGCAAGTGGAAACAGGCAGCCCAGCTAATCGCAAAAGAAGAGGAAGTACCTGTCTACAAAGACTTGGTGACAGA

GTCTGATATTTTGACCACCTTGAGTCGCTTGTCAGGAATCCCAGTTCAAAAACTGACTCAAACGGATGCTA^ 

TATTTAAATCTTGAAGCAGAACrCCATAAACGGGTTATCGGTCAAGATCAAGCTGTrrCAAGCATTAGCOT^ 

TTCGCCGCAACCAGTCAGGGATTCGCAGTCATAAGCGTCCGATTGGTTCCTITATGTTCCTAGGGCCTACAGGTG^ 

CGGGAAAACTGAATTAGCCAAGGCTCTGGCAGAAGTTCTTTITGACGACGAATCAGCCCTTATCCGC^ 

AGTGAGTATATGGAGAAATTTGCAGCTAGTCGTCTCAACGGAGCTCCTCCAGGCTATGTAGGATATGAAGAAGGT 

GGGGAGTTGACAGAGAAGGTTCGCAATAAACCCTATTCCGrrCTCCTCTTTGATGAGGTAGAGAAGGCCCACCCA 

GATATCTITAATGTTCTCTTGCAGGTTCTGGATGACGGTGTCrTGACAGATAGCAAGGGACGCAAGGTCGAT^^ 

CAAATACCATTATCATTATGACATCGAATCTAGGTGCGACTGCCCTTCGTGATGATAAGACTGTTGGTm

TAAGGATATTCGTTTTGACCAGGAAAATATGGAAAAACGCATGriTGAAGAACTGAAAAAAGCTTATAGACCGGA 

ATTCATCAACCGTATTGATGAGAAGGTGGTCTTCCATAGCCTATCTAGTGATCATATGCAGGAAGTGGTGAAGArr 

ATGGTCAAGCCTTTAGTGGCAAGTTTGACTGAAAAAGGCATTGACTTGAAATTACAAGOT 

TAGCAAATCAAGGATATGACCCAGAGATGGGAGCTCGCCCACTTCGCAGAACCCTGCAAACAGAAGTGGAGGAC 

AAGTTGGCAGAACirCTTCTCAAGGGAGATTTAGTGGCAGGCAGCACACTTAAGATTGGTGTCAA^^ 

TTAAAATTTGATATTGCATAA 

MNYSKALNECIESAYMVAGHFGARYLESWHLLIAMSNHSYSVAGATLNDYPYEMDRLEEVALELTETDYSQDETFTE 

LPFSRRLQVLFDEAEYVASVVHAKVLGTEHVLYAILHDSNAIJ^TRILERAGFSYEDKKDQVKIAALRRNLEERAGOT^ 

EDLKALRQRHRTVADKQNSMANMMGMPQTPSGGLEDYTHDLTEQARSGKLEPVIGRDKEISRMIQILSRKTKNNPVLV 

GDAGVGKTALALGLAQRIASGDVPAEMAKMRVLELDLMNVVAGTRFRGDFEERMNNIIKDIEEDGQVILFIDELHTIM 

GSGSGIDSTLDAANILKPALARGTLRTVGATTQEEYQKHIEKDAALSRRFAKVTIEEPSVADSMTILQGLKATYEKHHRV 

QITDEAVETAVKMAHRYLTSRHLPDSAIDLUDEAAATVQ^aCAKHVKADDSDLSPADKALMDGKWKQAAQLlAKEEEV 

PVYKDLVTESDILTTLSRLSGIPVQKLTQTDAKKYLNLEAELHKRVIGQDQAVSSISRAIRRNQSGIRSHKRPIGSFMFLGP 

TGVGKTELAKALAEVLFDDESALIRFDMSEYMEKFAASRLNGAPPGYVGYEEGGELTEKVRNKPYSVLLFDEVEKAHP 

DIFNVLLQVLDDGVLTDSKGRKVDFSNTIIIMTSNLGATALRDDKTVGFGAKDIRFDQENMEKRMFEELKKAYRPEFIN 

RIDEKVVFHSLSSDHMQEVVKIMVKPLVASLTEKGIDLKLQASALKLLANQGYDPEMGARPLRRTLQTEVEDKLAELL 

LKGDLVAGSTUCIGVKAGQLKFDIAZ 

ID40 1005bp

ATGAAGAAAACATGGAAAGTGTITITAACGCrrGTAACAGCTCTTGTAGCTGTTGTGOT 

GAACTGCTTCTAAAGACAACAAAGAGGCAGAACTTAAGAAGGTTGACmATCCTAGACTGGAC^^ 

ACCACACAGGGCTTTATGTTGCCAAGGAAAAAGGTTATTTCAAAGAAGCTGGAGTGGATGTTGATTT 

CACCAGAAGAAAGTTCrrCTGACrrGGTTATCAACGGAAAGGCACCATTTGCAGTGTATTTCCAAGACT^ 

TAAGAAATrGGAAAAAGGAGCAGGAATCACTGCCGTrGCAGCTATTGTTGAACACAATACATCAGGAATCATCTC 

TCGTAAATCTGATAATGTAAGCAGTCCAAAAGACTTGGTTGGTAAGAAATATGGGACATGGAATGACCCAACTGA 

ACTTGCTATGTTGAAAACCITGGTAGAATCTCAAGGTGGAGACTTrGAGAAGGTrGAAAAAGTACCAAAT^ 

CTCAAACTCAATCACACCGATTGCCAATGGCGTCTTTGATACTGCTTGGATTTACT^ 

GCrAAATCTCAAGGTGTAGATGCTAACTTCATGTACTTGAAAGACTATGTCAAGGAGTTTGACTACTA 

TTATCATCGCAAACAACGACTATCTGAAAGATAACAAAGAAGAAGCTCGCAAAGTCATCCAAGCCATCAAAAAA 

GGCTACCAATATGCCATGGAACATCCAGAAGAAGCTGCAGATATTCTCATCAAGAATGCACCTGAACTCAAGGAA 

AAACGTGACTTTGTCATCGAATCTCAAAAATACTTGTCAAAAGAATACGCAAGCGACAAGGAAAAATGGGGTCAA 

TTTGACGCAGCTCGCTGGAATGCriTTCTACAAATGGGATAAAGAAAATGGTATCCTTAAAGAAGAOT 

AAAGGCTTCACCAACGAATTTGTGAAATAA 

MKKTWKVFLTLVTALVAVVLVACGQGTASKDNKEAELKKVDFILDWTPNTNHTGLYVAKEKGYFKEAGVDVDLKLP 

PEESSSDLVINGKAPFAVYFQDYMAKKLEKGAGITAVAAIVEHNTSGIISRKSDNVSSPKDLVGKKYGTWNDPTELAML 

KTLVESQGGDFEKVEKVPNNDSNSITPIANGVFDTAWIYYGWDGILAKSQGVDANFMYLKDYVKEFDYYSPVIIANND 

YLKDNKEEARKVIQAIKKGYQYAMEHPEEAADILIKNAPELKEKRDFVIESQKYLSKEYASDKEKWGOFDAARWNAFY 
KWDKENGILKEDLTDKGFTNEFVKZ 

ID41 762bp

TTGATGAGAA/£lTGAGAAGTATACrGAGACGACACAT^^ 

AGTTAGCAGGTTTTCTTAAACTTCTCCCCAAGTTTATCCTGCCGACACCTCnTGAAATTCT 
GACAGAGAATTTCTCTGGCACCATAGCTGGGCGACCTTGAGAGTGGCITrACrGGGGCTGAT^

MMKETQLLKGVLEGCVLDMIGQKERYGYELVQTLREAGFDTIVPGTIYPLLQKLEKNQWIRGDMRPSPDGPDRKYFSL
MKEGEERVSVFWQQWDDLSQKVEGIKNGGZ 

ID45 816bp

ATGAAGAAAATGAAGTATTACGAAGAAACAAGCGCTTTGCT^ 

GAGGAGTTGTGGGAAAGTTTTAATCrrGCTGGATTTCTCTATGATGAAGACTATCTCAGAGAGCAGATCTAm 

TGATGCTAGATTTCTCAGAAGCAGAACGAGATGGCATGAGTGCAGAGGATTATCTAGGTAAGAATCCTAAAAAAA 

TAATGAAAGAGATTCTCAAGGGAGCACCTCGCAGTTCTATCAAAGAGTCCCTTTTGACGCCAATTCTTGTCCT^ 

GGTATTACGTTATTATCAACTACTAAGTGATTTTTCTAAAGGTCCTCTCTTAACAGTCAATTTGCT
GGCAACTTCTTATTTTTCTGATTGGATTTGGACTTGTGGCCACAATTT^ 
AAAATGAAAATTGGCACrTACATTGTTGTTGGGACTATAGTTCTTCTAGTTGTTTTAGGAT^^
GCTTCATACAAGAAGGAGCCTTTTATATTCCGGCTCCCTGGGATAGTTTGTCTGTCTTTACGA 
GGTATTTGGAATTGGAAAGAAGCGGTCTTTCGTCCATrrGTCAGTATGATTATTGCCCATCTTGTGGTGGGT^ 

GCTCCGTTATTATGAGTGGATGGGAATTTCAAATGTTTTCC^

GAATCTTTGTCTTGTTCCGTGGGTTTAAGAAGATAAAATGGAGTGAAGTATAG 

MKKMKYYEETSALLHEFSEENQKYFEELWESFNLAGFLYDEDYLREQIYLMMLDFSEAERDGMSAEDYLGKNPKKIM 
KEILKGAPRSSIKESLLTPILVLAVLRYYQLLSDFSKGPLLTVNLLTFLGQLLIFLIGFGLVATILRRSLVQDSPKMKIGTYI 
VVGTIVLLVVLGYVGMASFIQEGAFYIPAPWDSLSVFTISLVIGIWNWKEAVFRPFVSMIIAHLVVGSLLRYYEWMGISN
VFLTKVIPLAVLFIGIFVLFRGFKKIKWSEVZ

ID46 348bp

CTGi 1 1 1 1 1 lArrTATACTCAATGAAAATCAAAGAGCAAACTAGGAAGCTAGCCGCAGGTTGCTCAAAACACTGTT

TTGAGGTTGTAGACGAAACTGACGAAGTCAGCTCAAAACATGTTTTTGAGGTTGTAGATGAAACTGACGAAGTCA
GCTCAAAACACTGTTITGAGGTTGTAGATGAAACTGACGAAGTCAGCTCAAAACACrGTTTTGAGGT^ 
AAACTGACGAAGTCAGCTCAAAACATGTTTTTGAGGTTGTAGATGAAACTGACGAAGTCAGTAACCAT^^ 
GTAGGGCGACGCTGACGTGGTTTGAAGAGATTTTCGAAGAGTATTAA 






MFFYLYSMKIKEQTRKLAAGCSKHCFEVVDETDEVSSKHVFEVVDETDEVSSKHCFEVVDETDEVSSKHCFEVVDETD 
EVSSKHVFEVVDETDEVSNHTYGRATLTWFEEIFEEYZ 

ID47 1260bp



ATGCAGAATCTGAAATTTGCCTTTTCATCTATCATGGCTCACAAGATGCGTTCTTTGa^ 

TATCGGTGTTTCATCAGTTGTTGTGATTATGGCm 

AATCTCAGAAAAATATTAGCGTCTTTTTCT 

TTrrACGGTTTCTGGAAAGGAAGAGGAAGTTCCTGTTGAACCGCCAAAACCGCAAGAATCCTGGGTCCAAGAG^ 
AGCTAAACTGAAGGGAGTGGATAGTTACTATGTAACCAATTCAACGAATGCCATCrrGACCTATCAAG

GGTTGAGAATGCTAATTTGACAGGTGGAAACAGAACnTACATGGACGCTGTTAAGAATGAAATTATTGCAGGTCG 
TAGTCTGAGAGAGCAAGATTTCAAAGAGTTTGCAAGTGTCATTTrGCTAGATGAGGAATTGTCCATTAGm 
GAATCTCCTCAAGAGGCTATTAACAAGGTTGTAGAAGTCAATGGATTTAGTTACCGGGTCATTGGGGTTTATACTA 
GTCCGGAGGCTAAAAGATCAAAAATATATGGGTTTGGTGGCTTGCCTATTACTACCAATATCTCCCTTGCT^ 
TTTTAATGTAGATGAAATAGCTAATArrGTCTTTCGAGTGAATGATACCAGTTTAACCCCAACTCTGG

CTGGCACGAAAAATGACAGAGCTTGCAGGCTTACAACAGGGAGAATACCAGGTGGCAGATGAGTCCGTTGTATTT
GCAGAAATTCAACAATCGTTTAGTTTTATGACGACGATTArrAGTTCCATCGCAGGGATTTCTCTCm
GAACTGGTGTCATGAACATCATGCTGGTTTCGGTGACAGAGCGCACTCGTGAGATTGGTCTTCGTAAGGCTTTGGG
TGCAACACGTGCCAATATTTTAATTCAGTTTTTGATTGAATCCATGATTTTGACCT^ 

TGACAATTGCAAGTGGTTTAACTGCCTTAGCAGGTTTGTTACTGCAAGGTTTAATAGAAGGTATAGAAG^^ 

ATCAATCCCAGTCGCCCTATTTAGTCTTGCAGTTTCGGCTAGTGTTGGTATGATTITTGGAGTC^ 

AAGGCATCGAAACTTGATCCAATTGAAGCCCTTCGTTATGAATGA 



MQNLKFAFSSIMAHKMRSLLTMIGIIIGVSSVVVIMALGDSLSRQVNKDMTKSQKNISVFFSPKKSKDGSFTQKQSAFTVS 
GKEEEVPVEPPKPQESWVQEAAKLKGVDSYYVTNSTNAILTYQDKKVENANLTGGNRTYMDAVKNEIIAGRSLREQDF 
KEFASVILLDEELSISLFESPQEAINKVVEVNGFSYRVIGVYTSPEAKRSKIYGFGGLPITTNISLAANFNVDEIANIVFRVN 
DTSLTPTLGPELARKMTELAGLQQGEYQVADESVVFAEIQQSFSFMTTnSSIAGlSLFVGGTGVMNIMLVSVTERTREIG 
LRKALGATRANILIQFUESMILTLLGGLIGLTIASGLTALAGLLLQGLIEGIEVGVSIPVALFSLAVSASVGMIFGVLPANK 
ASKLDPIEALRYEZ 

ID48 70gbp 



CTGATGAAGCAACTAATTAGTCTAAAAAATATCTTCAGAAGTTACCGTAATGGTGACCAAGAACTGCAGGTTCTC 
AAAAATATCAATCTAGAAGTGAATGAGGGTGAATTTGTAGCCATCATGGGACCATCTGGGTCTGGTAAGTCCACT 
CTGATGAATACGATTGGCATGTTGGATACACCAACCAGTGGAGAATATTATCTTGAAGGTCAAGAAGTGGCTGGG
CTTGGTGAAAAACAACTAGCTAAGGTCCGTAACCAACAAATCGGTTTTGTCTTTCAGCAGTTC^

AGCTCAATGCTCTGCAAAATGTAGAATTGCCCTTGATTTACGCAGGAGTTTCGTCTTCAAAACGTCGC^ 

TGAGGAATATITAGACAAGGTTGAATTGACAGAACGTAGTCACCATTTACCTTCAGAATTATCTGGTGGTCAAA^ 

GCAACGTGTAGCCATTGCGCGTGCCTTGGTAAACAATCCrrCTATTATCCTAGCGGATGAACCGACAGGAGCCTTG 

GATACCAAAACAGGTAACCAAATTATGCAATTATTGGTTGATTTGAATAAAGAAGGAAAAACCArrATCATGGTA 

ACGCATGAGCCTGAGATTGCTGCCTATGCCAAACGTCAGATTGTCATTCGGGATGGGGTCATTTCGTCTGACAGTG 

CTCAGTTAGGAAAGGAGGAAAACTAA 

MMKQLISLKNIFRSYRNGDOELQVLKNINLEVNEGEFVAIMGPSGSGKSTLMNTIGMLDTPTSGEYYLEGQEVAGLGEK 
QLAKVRNQQIGFVFQQFFLLSKLNALQNVELPLiYAGVSSSKRRKLAEEYLDKVELTERSHHLPSELSGGQKQRVAIARA 
LVNNPSIILADEPTGALDTKTGNQIMQLLVDLNKEGKTIIMVTHEPEIAAYAKRQIVIRDGVISSDSAQLGKEENZ 

ID49 1200bp

ATGAAGAAAAAGAATGGTAAAGCTAAAAAGTGGCAACTGTATGCAGCAATCGGTGCTGCGAGTGTAGTTGTATTG
GGTGCTGGGGGGATTTTACTCTTTAGACAACCTTCTCAGACTGCTCTA^ 

CCAAGGAAGGAAGCGTGGCCTCCTCTGTTTTATTGTCAGGGACAGTAACAGCAAAAAATGAACAATATGm 

TTGATGCTAGTAAGGGTGATITAGATGAAATCCTrGTTTCTGTGGGCGATAAGGTCAGCGAAGGGCAGGCn^ 

CAAGTACAGTAGTTCAGAAGCGCAGGCGGCCTATGATTCAGCTAGTCGAGCAGTAGCTAGGGCAGATCGTCATAT 

CAATGAACTCAATCAAGCACGAAATGAAGCCGCTTCAGCTCCGGCTCCACAGTTACCAGCGCCAGTAGGAGGAGA 

AGATGCAACGGTGCAAAGCCCAACTCCAGTGGCTGGAAATTCTGTTGCTTCTATTGACGCTCAATTGGGTGATGCC 

CGTGATGCGCGTGCAGATGCTGCGGCGCAATTAAGCAAGGCTCAAAGTCAATTGGATGCAACAACTGTTCTCAGT 

ACCCTAGAGGGAACTGTGGTCGAAGTCAATAGCAATGTTTCTAAATCTCCAACAGGGGCGAGTCAAGTTATGGTT 

CATATTGTCAGCAATGAAAATTTACAAGTCAAGGGAGAATTGTCTGAGTACAATCTAGCCAACCmCTGTAGGTC

AAGAAGTAAGCTTTACTTCTAAAGTGTATCCTGATAAAAAATGGACTGGGAAATTAAGCTATAm 

TAAAAACAATGGTGAAGCAGCTAGTCCAGCAGCCGGGAATAATACAGGTTCTAAATACCCTTATACTATTGATGT 

GACAGGCGAGGTTGGTGATTTGAAACAAGGTTTTTCTGTCAACATTGAGGTTAAAAGCAAAACTAAGGCTAT^ 

GTTCCTGTTAGCAGTCTAGTAATGGATGATAGTAAAAATTATGTCTGGATTGTGGATGAACAACAAAAGGCTAAA 

AAAGTTGAGGTTTCATTGGGAAATGCTGACGCAGAAAATCAAGAAATCACrrCTGGTTTy^^ 

GTCATCAGTAATCCAACATCTTCCTTGGAAGAAGGAAAAGAGGTGAAGGCTGATGAAGCAACTAATTAG 

MKKKNGKAKKWQLYAAIGAASVVVLGAGGILLFRQPSQTALKDEPTHLVVAKEGSVASSVLl^GTVTAKNEQYVYFD 

ASKGDLDEILVSVGDKVSEGQALVKYSSSEAQAAYDSASRAVARADRHINELNQARNEAASAPAPQLPAPVGGEDATV 

QSPTPVAGNSVASIDAQLGDARDARADAAAQLSKAQSQLDATTVLSTLEGTVVEVNSNVSKSPTGASQVMVHIVSNEN 

LQVKGELSEYNLANl^VGQEVSFTSKVYPDKKWTGKLSYISDYPKNNGEAASPAAGNNTGSKYPYTIDVTGEVGDLKQ 

GFSVNIEVKSKTKAILVPVSSLVMDDSKNYVWlVDEQQKAKKVEVSLGNADAENQEirSGLTNGAKVISNPTSSLEEGKE 

VKADEATNZ

ID50 759bp

ATGTCACGTAAACCATTTATCGCTGGTAACTGGAAAATGAACAAAAATCCAGAAGAAGCTAAAGCATTCGTTGA^ 

GCAGTTGCATCAAAACTTCCTTCATCAGATCTTGTTGAAGCAGGTATCGCTGCTCCAGCTCTTGA 

TTCTTGCTGTTGCAAAAGGCTCAAACCITAAAGTTGCTGCTCAAAACTGCrACm

TGGTGAAACTAGCCCACAAGTTTTGAAAGAAATCGGTACTGACTACGTTGTTATCGGTCACTCAGAA^^ 

CTACTTCCATGAAACTGATGAAGATATCAACAAAAAAGCAAAAGCAATCTITGCGAACGGTATGCTTCCAATCAT 

CTGTTGTGGTGAATCAOTGAAACTTACGAAGCTGGTAAAGCTGCTGAATTCGTAGGTGCrCAAGTATCT^ 

TTGGCTGGATTGACTGCTGAACAAGTTGCTGCCTCAGTTATCGCTTATGAGCCAATCTGGGCTATC^ 

AATCAGCTTCACAAGACGATGCACAAAAAATGTGTAAAGTTGTTCGTGACGTTGTAGCTGCTGACTTTGGTC^ 

AAGTCGCAGACAAAGTTCGTGTTCAATACGGTGGTTCTGTTAAACCTGAAAATGTTGCTTCATACATGGCTTGCCC 

AGACGTTGACGGTGCCCTTGTAGGTGGTGCGTCACTTGAAGCTGAAAGCTTCTTGGCTTTGOT 

TAA 

MSRKPFIAGNWKMNKNPEEAKAFVEAVASKLPSSDLVEAGIAAPALDLTTVLAVAKGSNLKVAAQNCYFENAGAFTG 
ETSPQVLKEIGTDYVVIGHSERRDYFHETDEDINKKAKAIFANGMLPIICCGESLETYEAGKAAEFVGAQVSAALAGLTA 
EQVAASVIAYEPIWAIGTGKSASQDDAQKMCKVVRDVVAADFGQEVADKVRVQYGGSVKPENVASYMACPDVDGAL 
VGGASLEAESFLALLDFVKZ 

ID51 1473bp

TTGAAAACAAAAATTGGATTAGCAAGTATCTGTTTACTAGGCTTGGCAACTAGTCATGTCGCTGCAAATG 

AAGTAGCAAAAACTTCGCAGGATACAACGACAGCTTCAAGTAGTTCAGAGCAAAATCAGTCrrCTAATAAAACGC 

AAACGAGCGCAGAAGTA CAGACT AATGCTGCTGCCCACTGGGATGGGGATTATTATGTAAAGGATGATGGTTCTA 

AAGCTCAAAGTGAATGGATTTTTGACAACTACTATAAGGCTTGGTTTTATATTAATTCAGATGGTC^ 

GAATGAATGGCA TGGAA ATTACTACCTGAAATCAGGTGGATATATGGCCCAAAACGAGTGGATCTATGACAGTAA 

TTACAAGAGTTGGTTTTATCTCAAGTCAGATGGGGCTTATGCTCATCAAGAATGGCAATTGATTGGAAATAAGTC 
TACTACTTCAAGAAGTGGGGTTACATGGCTAAAAGCCAATGGCAAGGAAGTTATTTCTTGAATGGTCAAGGAGCT
ATGATGCAAAATGAATGGCTCTATGATCCAGCCTATTCTGCTTATTTTTATCTAAAATCCGATGGAAOT 
ACCAAGAGTGGCAAAAAGTGGGCGGCAAATGGTACTATTTCAAGAAGTGGGGCTATATGGCTCGGAATGAGTGGC 
AAGGCAACTACTATTTGACTGGAAGTGGTGCCATGGCGACTGACGAAGTGATTATGGATGGTACTCGCTATATCTT 
TGCGGCCTCTGGTGAGCTCAAAGAAAAAAAAGATTTGAATGTCGGCTGGGTTCACAGAGATGGTAAGCGCTAm
CrmAATAATAGAGAAGAACAAGTGGGAACCGAACATGCTAAGAAAGTCATTGATATTAGTGAGCACAATGGTCG 
TATCAATGATTGGAAAAAGGTTATTGATGAGAACGAAGTGGATGGTGTCATTGTTCGTCTAGGTTATAGCGGTAA 
AGAAGACAAGGAATrGGCGCATAACATTAAGGAGTTAAACCGTCTGGGAATTCCTTATGGTGTCTATCTCTATAC 
CTATGCTGAAAATGAGACCGTGCTGAGAGTGACGCTAAACAGACCATTGAACTTATAAAGAAATACAATATGAAC 

CTGTCTTACCCTATCTATTATGATGTTGAGAATTGGGAATATGTAAATAAGAGCAAGAGAGCTCCAAGTGATACA
GGCACTTGGGTTAAAATCATCAACAAGTACATGGACACGATGAAGCAGGCGGGTTATCAAAATGTGTATGTCTAT 
AGCTAT CGTA GTTTATTACAGACGCGTTTAAAACACCCAGATATTTTAAAACATGTAAACTGGGTAGCGGCCTATA 
CGAATGCTTTAGAATGGGAAAACCCTCATTATTCAGGAAAAAAAGGTTGGCAATATACCTCTTCTGAATACATGA 

AAGGAATCCAAGGGCGCGTAGATGTCAGCGTTTGGTATTAA

MKTKIGLASICLLGLATSHVAANETEVAiaSQD'rTTASSSSEQNQSSNKTQTSAEVQTNAAAHWDGDYYVKDDGSKAQ 
SEWIFDNYYKAWFVINSDGRYSQNEWHGNYYLKSGGYMAQNEWIYDSNYKSWFYLKSDGAYAHQEWQLIGNKWYY 
FKKWGYMAKSQWQGSYFLNGQGAMMQNEWLYDPAYSAYFYLKSDGTYANQEWQKVGGKWYYFKKWGYMARNE 
WQGNYYLTGSGAMATDEVIMDGTRYIFAASGELKEKKDLNVGWVHRDGKRYFFNNREEQVGTEHAKKVIDISEHNG 
RINDWKKVIDENEVDGVIVRLGYSGKEDKELAHNIKELNRLGIPYGVYLYTYAENETDAESDAKQTIELIKKYNMNLSY
PIYYDVENWEYVNKSKRAPSDTGTWVKIINKYMDTMKQAGYQNVYVYSYRSLLQTRLKHPDILKHVNWVAAYTNAL 
EWENPHYSGKKGWQYTSSEYMKGIQGRVDVSVWYZ 

ID52 774bp

ATGAAAAAATTTGCCAACCmATCTGGGACTGGTCTTTCTG^ 

TGCCTTTAATGCTGGT GATGA TATGAATAGCnTTACAGGTITrAGCTGGACTCACTT^ 

GGGAXjACTCATGCTGATTTTGGCTCAGACATTT^ 

CTmGGTGCCATTTACATCTACCAGTCTCGTAAGAAATACCAAGAAGCCTTTCTATCACTCAATAA 
30 GGTTGCGCCTGACGTTATGATTGGTGCTAGCTTCTTGATTCTCTTTACCCAACrCAAGT^ 

CCGTTCTATCTAGTCACGTGGCCTTCTCCATTCCTATCGTGGTCTTGATGGTCTTGCCTCGACTCAAGGAAATGi^ 
TGGCGACATGATTCATGCGGCCTATGACTTGGGAGCTAGTCAATTTCAGATGTTCAAGGAAATCATGCITCCT^ 
CTGACTCCGTCT ATCATT ACTGGTTATTTCATGGCCTTCACCTATTCGTTAGATGACTTTGCCGTGA 
AACAGGAAATCGCTTTTCAACCCTATCAGTCGAGATT^ 
35 GCCCTGTCTGCTCTAGTCrTTCTCTTTAGTATTATCCTAGTTGTAGGTTATTAC^ 
GCAAGCATGA 

MKKFANLYLGLVFLVLYLPIFYLIGYAFNAGDDMNSFTGFSWTHFETMFGDGRLMLILAQTFFLAFLSALIATnGTFGA 
IYIYQSRKKYQEAFLSLNNILMVAPDVMIGASFLILFTQLKFSLGFLTVLSSHVAFSIPIVVLMVLPRLKEMNGDMIHAAY
DLGASQFQMFKEIMLPYLTPSIITGYFMAFTYSLDDFAVTFFVTGNGFSTLSVEIYSRARKGISLEINALSALVFLFSIILVV
GYYFISREKEEQAZ 

ID59 1071bp

45 ATGAAAAAAATCTATTCATTTTTAGCAGGAATTGCAGCGATTATCCTTGTCTTGTGGGGAATrGCGACT 

ATAGTAAAATCAATAGTCGAGATAGTCAAAAATTGGTTATCTATAACTGGGGAGACTATATCGATCCTGAACTCTT 
GACTCAGTTTACAGAAGAAACAGGAATTCAAGTTCAGTACGAGACTTTTGACTCCAACGAAGCCATGTACACTAA 
GATAAAGCAGGGTGGAACGACCTACGATATTGCCATTCCAAGTGAATACATGATTAACAAGATGAAGGACGAAG 
ACCTCTTGGTTCCGCTTGATTATTCAAAAATTGAAGGAATCGAAAATATCGGACCAGAGTTTCTCAACCAGTCCT^ 

50 TGACCCAGGTAATAAATTCTCCAT CCCTT ACTTCrGGGGAACCTTAGGAATTGTCTACAACGAAACCA^^ 

GAAGCGCCTGAGCATTGGGATGACCTTTGGAAGCCGGAGTATAAGAATTCTATCATGCTCITTGATGGGGCGCGT 
GAGGTGCTGGGACTAGGACTCAATTCCCTCGGCTACAGCCTCAACTCCAAGGATCTGCAGCAGTTGGAAGAGACA 
GTGGATAAGCTCTACAAACTGACTCCAAATATCAAGGCTATCGTTGCGGACGAGATGAAGGGCTATATGATTCAG 
AATAATGTTGCAATCGGCGTGACC n-CTC TGGTGAAGCCAGCCAAATGTTAGAAAAAAATGAAAATCTACGTTAT 

55 GTGGTACCGACAGAGG CCAG CAATCTTTGGTTTGACAATATGGTCATTCCCAAAACAGTTAAAAACCAAAACTCA 
GCCTATGCCTTTATCAACTTTATGTTGAAACCTGAAAATGCTCTCCAAAATGCGGAGTATGTCGGCT^ 
CAAACCTACCAGCGAAGGAATTGCTCCCAGAGGAAACAAAGGAAGATAAGGCCTTCTATCCCGATGTTGAAACCA 
TGAAACACCTAGAAGTTTATGAGAAATTTGACCATAAATGGACAGGGAAATATAGCGACCTCTTCCTACAGTTTA 
AAATGTATCGGAAGTAG 

MKKIYSFLAGIAAIILVLWGIATHLDSKINSRDSQKLVIYNWGDYIDPELLTQFTEETGIQVQYETFDSNEAMYTKIKQGG
TTYDIAIPSEYMINKMKDEDLLVPLDYSKIEGIENIGPEFLNQSFDPGNKFSIPYFWGTLGIVYNETMVDEAPEHWDDLW 
KPEYKNSIMLFDGAREVLGLGLNSLGYSLNSKDLQQLEETVDKLYKLTPNIKAIVADEMKGYMIQNNVAIGVTFSGEAS 
QMLEKNENLRYVVPTEASNLWFDNMVIPKTVKNQNSAYAFINFMLKPENALQNAEYVGYSTPNLPAKELLPEETKED 
KAFYPDVETMKHLEVYEKFDHKWTGKYSDLFLQFKMYRKZ
ID61 1851bp

ATGAATAAAAAACTAACAGATTATGTGATTGATCrGGTGGAAATm 

GGAATATTTGATATTTTCAGTATGGTGGTTTCCATCATTGTATCnTATATTTTAT^ 

ACCTGTTGACTACATTATCTATACGAGTTTGGCCTTCCTGTTCTATCAATTGATGATTGGT^^ 

CGAGCATTAGTCGTTACAGCAAGATTACGGATTTCATGAAAATCriTTTTGGTGTGACTGCT^ 

AT ATAG TATCTGTTATGCCTTCTTGCCACTCTTCTCCATCCGTTTCATCATTCTCT^ 

GATTITATTGCCACGGATTACnTGGCAGTTAATCTACTCCAGACGCAAAAAAGGTAGTGGTGATGGAGAACACCG 

TCGGACa'iril'G ATTGG TGCCGGTGATGGTGGGGCTCTTTTTATGGATAGTTACCAACATCCAACCAGTGA^ 

GAACTGGTCGGTATTTTGGATAAGGATTCTAAGAAAAAGGGTCAAAAACnTGGTGGTATTCCTGTm 

ATGACAATCTGCCTGAATTAGCCAAACGCCATCAAATCGAGCGTGTCATCGTTGCGATTCCGTCGCTGGATCCGTC 

AGAATATGAGCGTATCTTGCAGATGTGTAATAAGCTGGGTGTCAAATGTTACAAGATGCCTAAGGTTGAAACTGT 

TGTTCAGGGCCTTCACCAAGCAGGTACTGGCTTCCAAAAAATTGATATTACGGACCTT^ 

CGTCTTGACGAATCGCGTCTGGGTGCAGAACTGACAGGTAAGACCATCTTAGTCACAGGAGCTGGAGGTTCAATC 

GGTTCTGAAATCTGTCGTCAAGTTAGTCGCITCAATCCTGAACGCATTGTCTrGCTCGGTCATGGGGAAAA 

TCTACCTTGTTTATCATGAATTGATTCGTAAGTTCCAAGGGATTGATTATGTACCTGTGATTGCGGACATTCAAGA 

CTATGATCGTTTGTTGCAAGTCTTTGAGCAGTACAAACCTGCTATTGTTTATCATGCGGCAGCCCACAAGCA^ 

CCTATGATGGAGCGCAATCCAAAAGAAGCCTTCAAAAACAATATCCGTGGAACTTACAATGTTGCTAAGGCTGTT 

GATGAAGCTAAAGTGTCTAAGATGGTTATGATTTCGACAGATAAGGCAGTCAATCCACCAAATGTTATGGGAGCA 

ACCAAGCGCGTGGCGGAGTTGATTGTCACTGGCTTTAACCAACGTAGCCAATCAACCTACTGTGCAGTTCGTTT^ 

GGAATGTTCTTGGTAGCCGTGGTAGTGTCATTCCAGTCTTTGAACGTCAGATTGCTGAAGGTGGGCCTGTAACGGT 

GACAGACTTCCGTATGA CCCGT TACTTTATGACCATTCCAGAAGCTAGCCGTCTGGTTATCCATGCTGGTGOT 

GCCAAAGATGGGGAAGTCrrrATCCTTGATATGGGCAAACCAGTCAAGATTTATGACTTGGCCAAGAAGATGGTG 

CTTCTAAGTGGCCACACTGAAAGTGAAATTCCAATCGTTGAAGTTGGAATCCGCCCAGGTGAAAAACTCTACGAA 

GAAC TCTTGGTATCAACCGAACTCGTTGATAATCAAGTTATGGATAAGATTTTCGTTGGTAAGGTTAATGTCATGC 

CTTTAGAATCCATCAATCAAAAGATTGGAGAGTTCCGCACTCTCAGTGGAGATGAGTTGAAGCAAGCTATTATCG 

CCTTTGCTAATCAAACAACCCACATTGAATAA 

MNKKLTDWIDLVEILNKQQKQVFWGIFDIFSMVVSIWSYILFi'GUNPAPVDYIIYTSLAFLFYQLMIGFWGLNASISRV 

SKITDFMKIFFGVTASSVLSYSICYAFLPLFSIRFIILFILLSTFLILLPRITWQLIYSRRKKGSGDGEHRRTFLIGAGDGGALF 

MDSYQHPTSELELVGILDKDSKKKGQKLGGIPVLGSYDNLPELAKRHQIERVIVAIPSLDPSEYERILQMCNKLGVKCYK 

MPKVETVVQGLHQAGTGFQKIDITDLLGRQEIRLDESRLGAELTGKTILVTGAGGSIGSEICRQVSRFNPERIVLLGHGEN 

SIYLVYHELIRKFQGIDYVPVIADIQDYDRLLQVFEQYKPAIVYHAAAHKHVPMMERNPKEAFKNNIRGTYNVAKAVD 

EAKVSKMVMISTDKAVNPPNVMGATKRVAELIVTGFNQRSQSTYCAVRFGNVLGSRGSVIPVFERQIAEGGPVTVTDFR 

MTRYFMTIPEASRLVIHAGAYAKDGEVFILDMGKPVKIYDLAKKMVLLSGHTESEIPIVEVGIRPGEKLYEELLVSTELV 

DNQVMDKIFVGKVNVMPLESINQKIGEFRTLSGDELKQAIIAFANQTTHIEZ 

ID101 1338bp

ATGATTGAACTTTATGATAGTTACAGTCAAGAAAGTCGAGATTTACATGAAAGTCTAG 

AACTTGGAGTGGTCATCGATGCAGATGGTTTTCTGCCTGATGGTCTGCTTTCT 

GAGGATGGAAAACCTCTCTArmAATCAAGTTCCCGTTTCAGATTTTTGGGAAAT^ 

CITGTATTGAAGATGTGACGCAGGAGAGGGCTGTCATTCATTATGCTGATGGAATGCAGGCTCGCTTGGTT^ 

GGTAGACTGGAAAGACCTAGAAGGTCGAGTACGTCAGGTTGACCACTACAATCGCTTCGGAGCTTGTTTTGCT^ 

AACGACnTATAGCGCAGATAGCGAGCCGATTATGACAGTTTACCAAGATGTCAATGGTCAACAAGTTTTACTGGA 

AAACCATGTGAraGGTGATATCTTATTGACTTTGCCAGGTCAGTC^ 

ATCACCrTCTTTTTGCAAGATTTGGAAATAGATACCAGTCAGCITATCTTTAA^^ 

TTCCTTCCATCAT CCAG ATAAATCTGGCTCGGATGTCTTGGTATGGCAGGAACCTCTCTATGATGCCATTCCAG 

AATA TGCA GTTGATTTTGGAAAGTGATAATGTGCGTACTAAGAAGATCATCATTCCAAATAAGGCGACTTATGAG 

CGCGCTTTAGAGTTAACTGACGAGAAATACCATGATCAGTTTGTGCACnTGGGTTATCATTACCAGTTCAAACGTG 

ATAATTTCCTAAGACGAGATGCCTTAATCTTGACCAATTCAGATCAGATTGAGCAAGTAGAAGCAATCGCAGGAG 

CCTTGCCTGAT^CACTTTCCGTATTGCAGCGGTGACAGAGATGTCTTCT 

AATGTGGCCCTTTACCAGAACGCTAGTCCACAGAAGATTCAGGAGCTGTATCAACTGTCGGATATTTACn^ 

TAAACCACAGTAATGAGTTGCTACAGGCAGTGCGTCAGGCCTTTGAGCACAATCTCTTGATTCTTGGCTT^ 

GACGGTGCACA ATAGA CTTTATATCGCTCCAGACCATCTATTTGAAAGTAGTGAAGTTGCTGCrr^ 

ATTAAATTGGCCCTTTCAGATGTTGATCAAATGCGTCAGGCACITGGCAAACAAGGCCAACATGCAAATTATG^^ 

ACTTGGTGAGATATCAGGAAACCATGCAAACTGTTTTAGGAGGCTAA 

MIELYDSYSQESRDLHESLVATGLSQLGVVIDADGFLPDGLLSPFTYYLGYEDGKPLYFNQVPVSDFWEILGDNQSACIE 

DVTQERAVIHYADGMQARLVKQVDWKDLEGRVRQVDHYNRFGACFATTTYSADSEPIMTVYQDVNGQQVLLENHV 

TGDILLTLPGQSMRYFANKVEFITFFLQDLEIDTSQLIFNTLATPFLVSFHHPDKSGSDVLVWQEPLYDAIPGNMQLILES 

DNVRTKKIIIPNKATYERALELTDEKYHDQFVHLGYHYQFKRDNFLRRDALILTNSDQIEQVEAIAGALPDVTFRIAAVT 

EMSSKLLDMLCYPNVALYQNASPQKIQELYQLSDIYLDINHSNELLQAVRQAFEHNLLILGFNQTVHNRLYIAPDHLFE 

SSEVAALVETIKLALSDVDQMRQALGKQGQHANYVDLVRYQETMQTVLGGZ 
wm mm

ATG ACAAT TTACAATATAAATTTAGGAATTGGTTGGGCTAGTAGCGGTGTTGAATACGCTCAAGCCT^ 
5 GTGTTmCGGAAATTAAATCTGTCCTCTAAGTTTATCT^ 

ACAGCCAATATTGGTTTTGATGATAATCAGGTTATCTGGCTTTATAATCATTTCACAGATATCAAAA^^ 
CTAGCGTGACAGTGGATGATGTCITGGCTTACTT^ 

TACGTGTATTC TTTTTT GACCAAGATAAGTTTGTAACCTGTTATTTGGTTGATGAGAACAAGGAOT 
TGCCGAGTATGTTTTTAAGGGAAACCTGATTCGGAAGGATTACTTTTCTTATACGCGT^^ 

10 GCTCCCAAGGACAATGTTGCAGTCTTATACCAACGAACTTTTTATAATGAAGACGGGACrCCAGTCT^ 
T GATG AATCAA GGGAA GGAAGAAGTTTATCATTTCAAGGATAAGATTTTCTATGGAAAGCAAGCrm 
CCmATGAAATCTTTGAATTTGAATAAGTCTGATTTGGTCATTCTCGATAGGGAGACAGGTA^^ 
GTTrGAGGAA GCAC AGACAGCACATCTAGCGGTAGTTGTTCATGCGGAGCATTATAGTGAAAATGCTACAAATGA 
GGACTATATCCTTTGGAATAACTATTATGACTATCAGTTTACCAATGCAGATAAGGTTGACTTCm 

ACTGATAGACAAAATGAAGTTCTACAAGAGCAATTTGCCAAATATACTCAGCATCAGCCAAAGATTGTTACCATT
CCrGTAGGCAGTATTGATTCCTTGACAGATTCAAGTCAAGGGCGCAAACCATTTTCATTGATTACGGOT 
TTGCCAAAGAAAAGCACATTGATTGGCTTGTGAAAGCTGTGATTGAAGCTCATAAGGAGTTACCGGAACTAACCT 
TTGATATCTATGGTAGTGGTGG AGAAG ATTCTCTGCTTAGAGAAATTATTGCAAATCATCAGGCAGAGGACTATAT 
CCAACTCAAGGGGCATGCGGAACmCGCAGATTTATAGCCAGTATGAGGTCTACTTAACGGCTTCTACCAGra 

20 AGGATTTCGTCrGACCTTGATGGAAGCTATTGGTTCAGGTCTACCTCTAATTGG 

AGACCTTTATAGAGGATGGGCAAAATGGTTAmGATTCCAAGrrCATCTGACCATGTAGAAGACCAAATCAAGC 

AAGCTTATGCCGCTAAGATTTGTCAATTGTATCAAGAAAATCGTTTGGAAGCTATGCGTGCCT^ 

TGCAGAAGGCTTCTTGACCAAAGAAATTTTAGAAAAGTGGAAGAAAACAGTAGAGGAGGTGCTCCATGATTGA 

MTIYNINLGIGWASSGVEYAQAYRAGVFRKLNLSSKFIFTDMILADNIQHLTANIGFDDNQVIWLYNHFTDIKIAPTSVT
VDDVLAYFGGEESHREKNGKVLRVFFFDQDKFVTCYLVDENKDLVQHAEYVFKGNLIRKDYFSYTRYCSEYFAPKDN 
VAVLYQRTFYNEDGTPVYDILMNQGKEEVYHFKDKIFYGKQAFVRAFMKSLNLNKSDLVILDRETGIGQVVFEEAQTA 
HLAVVVHAEHYSENATNEDYILWNNYYDYQFTNADKVDFFIVSTDRQNEVLQEQFAKYTQHQPKIVTIPVGSIDSLTDS 
SQGRKPFSLITASRLAKEKHIDWLVKAVIEAHKELPELTFDIYGSGGEDSLLREIIANHQAEDYIQLKGHAELSQIYSQYE 

VYLTASTSEGFGLTLMEAIGSGLPLIGFDVPYGNQTFIEDGQNGYLIPSSSDHVEDQIKQAYAAKICQLYQENRLEAMRA
YSYQIAEGFLTKEILEKWKKTVEEVLHDZ 



ID103 2292bp

ATGTCCTCTCmCGGATCAAGAATTAGTAG 

40 TAGACGATATTTTGGTTGAAGCTTTTGCTGTGGTGCGTGAAGCAGATAAGCGGATTITAGGGATGm 

TGTTCAAGTCATGGGAGCTATTGTCATGCACTATGGAAATGTTGCTGAGATGAATACGGGGGAAGGTAAGACCTT 
GACAGCTACCATGCCTGTCTATTTGAACGCTTTTTCAGGAGAAGGAGTGATGGTTGTGACTCCTAATC 
TCAAAGCGTGATGCCGAGGAAATGGGTCAAGTTTATCGTTTTCTAGGATTGACCATTGGTGTACCATTTACGGAAG 
ATCCAA AGAAG GAGATGAAAGCTGAAGAAAAGAAGCTTATCTATGCTTCGGATATCATCTACACAACCAATAGTA 

45 ATTTAGGTTTTGATTATCTAAATGATAACCTAGCCTCGAATGAAGAAGGTAAGTTTITACGACCGTT^ 

GATTATTGATGAAATTGATGATATCTTGCTTGATAGTGCACAAACTCCTCTGATTATTGCGGGTTCrCCTCGTGTTC 
AGTCTAATTACTATGCGATCATTGATACACTTGTAACAACCTTGGTCGAAGGAGAGGATTATATCT^ 
GAAAGAGGAGGTTTGGCTCACTACTAAGGGGGCCAAGTCrGCTGAGAATTTCCTAGG 
GGAAGAGCATGCGTCTTTTGCrCGTCATTTGGTTTATGCGATTCGAGCTCATAAGCTCm 

TATATCATTCGTGGAAATGAGATGGTACTGGTTGATAAGGGAACAGGGCGTCTAATGGAAATGACTAAACTTCAA
GGAGGTCTCCAT CAGGCT ATTGAAGCCAAGGAACATGTCAAATTATCTCCTGAGACGCGGGCTATGGCCTCGATC 
ACCTATCAGAGTCTTTTTAAGATGTTTAATAAGATATCTGGTATGACAGGGACAGGTAAGGTCGCGGAAAAAGAG 
TTTATTGAAACTTACAATATGTCTGTAGTACGCATTCCAACCAATCGTCCGAGACAACGGATTGACTATC^ 
ATCTATA TATCAC TTTACCTGAAAAAGTGTATGCATCCTTGGAGTACATCAAGCAATACCATGCTAAGGGAAATCC 

55 TTTACTCGTTTTTGTAGGCTCAGTTGAAATGTCTCAACTCTATTCGTCTCTCTTGT^ 

ATGTCCTAAATGCTAATAATGCGGCGCGTGAGGCTCAGATTATCTCCGAGTCAGGTCAGATGGGGGCTGTGACAG 
TGGCTACCTCTATGGCAGGACGTGGTACGGATATCAAGCTTGGTAAAGGAGTCGCAGAGCTTGGGGGCTTGATTG 
TTATTGGGACTGAG CGGATG GAAAGTCAGCGGATCGACCTACAAATTCGTGGCCGTTCTGGTCGTCAGGGAGATC 
CTGGTATGAGTAAATTTTTTGTATCCTTAGAGGATGATGTTATCAAGAAATTTGGTCCATCT^ 

OO GTACAAAGACTATCAGGTTCAAGATATGACTCAACCX3GAAGTATTGAAAGGTCGTAAATACCGGAAACTAGTCGA 
AAAGGCTCAGCATGCCAGTGATAGTGCTGGACGTTCAGCACGTCGTCAGACTCTGGAGTATGCTGAAAGTATGAA 
TATACAACGGGATATAGTCTATAAAGAGAGAAATCGTCTAATAGATGGTTCTCGTGACnTAGAGGATGTTGTTGTG 
GATATCATTGAGA GATA TACAGAAGAGGTAGCGGCTGATCACTATGCTAGTCGTGAATTATTGTTTCACTT^ 
TGACCAATATTAGTTTTCATGTTAAAGAGGTTCCAGATTATATAGATGTAACTGACAAAACTGCAGTTCGTAGC^ 

05 TATGAAGCAGGTGATTGATAAAGAACTTTCrGAAAAGAAAGAATTACTTAATCAACATGACTTATATGAACAG 
TTTACGACTTTCACTGCTTAAAGCCATTGATGACAACTGGGTAGAGCAGGTAGACTATCTACAACAGCT^
GCTATCGGTGGTCAATCTGCTAGTCAGAAAAATCCAATCGTAGAGTACTATCAAGAAGCCTACGCGGGCTTTGAA 
GCTATGAAAGAACAGATTCATGCGGATATGGTGCGTAATCTCCTGATGGGGCTGGTTGAGGTCACTCCAAAAGGT 
GAAATCGTGACTCATTTTCCATAA

MSSLSDQELVAKTVEFRQRLSEGESLDDILVEAFAVVREADKRILGMFPYDVQVMGAIVMHYGNVAEMNTGEGKTLT 
ATMPVYLNAFSGEGVMVVTPNEYLSKRDAEEMGQVYRFLGLTIGVPFTEDPKKEMKAEEKKLIYASDIIYTTNSNLGF 
DYLNDNLASNEEGKFLRPFNYVIIDEIDDILLDSAQTPLIIAGSPRVQSNYYAIIDTLVTTLVEGEDYIFKEEKEEVWLTTK 
GAKSAENFLGIDNLYKEEHASFARHLVYAIRAHKLFTKDKDYIIRGNEMVLVDKGTGRLMEMTKLQGGLHQAIEAKEH 

VKLSPETRAMASITYQSLFKMFNKISGMTGTGKVAEKEFIETYNMSVVRIPTNRPRQRIDYPDNLYITLPEKVYASLEYIK
QYHAKGNPLLVFVGSVEMSQLYSSLLFREGIAHNVLNANNAAREAQIISESGQMGAVTVATSMAGRGTDIKLGKGVAE 
LGGLIVIGTERMESQRIDLQIRGRSGRQGDPGMSKFFVSLEDDVIKKFGPSWVHKKYKDYQVQDMTQPEVLKGRKYRK 
LVEKAQHASDSAGRSARRQTLEYAESMNIQRDIVYKERNRLIDGSRDLEDVVVDIIERYTEEVAADHYASRELLFHFIVT
NISFHVKEVPDYIDVTDKTAVRSFMKQVIDKELSEKKELLNQHDLYEQFLRLSLLKAIDDNWVEQVDYLQQLSMAIGG 

QSASQKNPIVEYYQEAYAGFEAMKEQIHADMVRNLLMGLVEVTPKGEIVTHFPZ

ID104 879bp

ATGAAACAAGAATGGTrrGAAAGTAATGATTTTGTAAAAACAACAAGCAAGAACAAGCCTGAAGAGCAAGCTCA 
IV AGAGGTTGCAGACAAGGCTGAAGAAAGGATACCCGATCTCGATACACCAATTGAAAAAAATACTCAGTTAGAGG 
AGGAAGTCTCTCAAGCTGAAGTCGAATTGGAAAGCCAGCAAGAAGAGAAAATTGAAGCTCCTGAAGACAGTGAA 
GCGAGAACAGAAATAGAAGAAAAGAAGGCATCTAATTCTACTGAAGAAGAGCCAGACCTTTCTAAAGAAACAGA 
AAAAGTCA CTAT AGCTGAAGAGAGCCAAGAAGCTCTTCCTCAGCAAAAAGCAACCACGAAAGAGCCACTTCTTAT 
CAG TAAAT CTTTAGAAAGTCCTTATATCCCCGACCAAGCTCCAAAATCTAGGGATAAATGGAAAGAGCAAGTGCT 
ID TGATTTTTGGTCTTGGCTAGTGGAAGCGATCAAATCTCCTACAAGTAAGTTGGAAACAAGTATCACAC 
ACAGCCriTCTCTTGCTCATTCTGTTTTCTGCATCTTCCTTmC^ 
GGACATATAGCAAGCATTAA'CAOTCGCTTCCCT^ 

AGTAGCGACAACACTCTTCTTCTTTTCATTCCTCTTGGGTAGTTTCGTTGTGAGACGAm 
CSACTGGACGCTAGACAAGGTTCTCCAACAATATAGTCAACrCTTGGCAATTCCAATCTCCrCACTGCT 
JU TTTCTTTGCTTTCTTTGATAGCCTACGATTTACAGCCCTCTTGTGTGTGA 

MKQEWFESNDFVKTTSKNKPEEQAQEVADKAEERIPDLDTPIEKNTQLEEEVSQAEVELESQQEEKIEAPEDSEARTEIE 
EKKASNSTEEEPDl^KETEKVTIAEESOEALPQQKATTKEPLLISKSLESPYIPDQAPKSRDKWKEQVLDFWSWLVEAIKS 
PTSKLETSrrHSYTAFLLLU-FSASSFFFSIYHlKHAYYGHIASINSRFPEQlJ\PLTLFSIISlLVATTLFFFSFLLGSFVVRRFIH 
QEKDWTLDKVLQQYSQLLAIPISSLLLLVSLLSLIAYDLQPSCVZ

ID106 327bp

ATGTACTTTCCAACATCCTCTGCCTTGATTGAATTTCTCATCTTGGCTGTACTGGAGCAGGGTO 
4U TGAGATTAGCCAAACCATTAAGCTGATCGCTAATATCAAAGAATCCACACTCTATCCCATTCTCAAAAAATO^ 
AGGCAATAGCTTrCTGACAACCrATTCTAGAGAGTTCCAAGGTCGCATGCGCAAATACTACTCCTTGACAAA 
TGGTATAGAGCAGCTCTTGACCCTAAAAGATGAATGGGCACTCTATACAGACACCATCAATGGCATCATAGAAGG 
GAGTATCCGCCATGACAAGAACTGA 

MYFPTSSALIEFLILAVLEQGDSYGYEISQTIKLIANIKESTLYPILKKLEGNSFLTTYSREFQGRMRKYYSLTNGGIEQLLT
LKDEWALYTDTINGIIEGSIRHDKNZ 

ID108 954bp

ATGGATTTTGAAAAAATTGAACAAGCTT 

CCAACTTTTATGACGCCTTGGTGGAGCAAAATAGCATCTATCTGGATGGTGAAACTGAGCTAAACCAGGTCAAAG 
ACAACAATCAGGCCCTTAAGCGTTTAGCACTACGCAAAGAAGAATGGCTCAAGACCTACCAGTTTCTCTTGATGA 
AGGCTGGGCAAACAGAACCCrTGCAGGCCAATCACCAGTTTACACCGGATGCTATTGCTTTGCTTT^ 
TGTGGAAGAGTTGrrTAAAGAGGAGGAAATTACTATCCTCGAAATGGGTTCTGGGATGGGAATTCTAGGCGCTAT 
3D 1 1 1 C I IG ACCTCGCTTACTAAAAAGGTGG ATTACTTGGG AATGG AAGTGG ATG ATTTGCTG ATTG ATCTGGC AGCT 

AGCATGGCAGATGTAATTGGTTTGCAGGCTGGCTTTGTCCAAGGAGATGCCGTTCGCCCACAAATGCTCAAAGAA 
AGCGATGTGGTCATCAGTGACrrGCCTGTCGGCTArrATCCTGATGATGCCGrrGCGTCGCGCCATCAAGTTGCTT 

CTATTTTTCTAGCTCCGAGTGATTTGTTGACCAGTCCTCAAAGTGATTTGTTAAAA^ 
OU GAGTCTGGTTGCTATGATTAGTCTGCCTGAAAATCTCTTTGCTAATGCCAAACAATCTAAGACTAT^^ 
AGAAGAAAAATGAAATAGCAGTAGAGCCITTTGTTTATCCAOT 
ATTTAAAGAAAATTTTCAAAAATGGACTCAAGGTACTGAAATATAA 

MDFEKIEQAYIYLLENVQVIQSDLATNFYDALVEQNSIYLDGETELNQVKDNNQALKRLALRKEEWLKTYQFLLMKA 
GQTEPLQANHQFTPDAIALLLVFIVEELFKEEEITILEMGSGMGILGAIFLTSLTKKVDYLGMEVDDLLIDLAASMADVI
GLQAGFVQGDAVRPQMLKESDVVISDLPVGYYPDDAVASRHQVASSQEHTYAHHLLMEQGLKYLKSDGYAIFLAPSD
LLTSPQSDLLKEWLKEEASLVAMISLPENLFANAKQSKTIFILQKKNEIAVEPFVYPLASLQDASVLMKFKENFQKWTQG 
TEIZ 

ID110 1902bp

ATGATTATTTTACAAGCTAATAAAATTGAACGTTCTTTTGCAGGAGAGGTTCTTTT^^ 
TTGATGAACGAGATCGGATTGCTCTTGTTGGGAAAAATGGTGCAGGTAAGTCTACrCITTTGAAGAT^ 
AGAA GAGGAGCCAACTAGCGGAGAAATCAATAAGAAAAAAGATATTTCTCTGTCTTACCTAGCCCAAGATAGCCG 
10 TTTTGAGTCTGAAAATACCATCTACGATGAAATGCTTCATGTCrrrAATGATTTGCGTCG^^ 

CGTCAGAT GGAG CTGGAGATGGGTGAAAAGTCTGGTGAGGATTTGGATAAACTGATGTCAGATTATGACCGCTTA 
TCTGAGAATTTTCGCCAAGCAGGTGGCTTTACCTATGAAGCTGATATTCGAGCGATTT^ 

ACGAGTCTATGTGGCAGATGAAAATTGCTGAGCTTTCTGGTGGTCAAAATACTCGTTTGGCAOT . 
CCTTGAAAAGCCCAATCTCTTGGTOTGGACGAGCCAACTAACCACTTGGATATTGAAACCAT^^ 

15 GAATTACTTGGTAAACTATAGCGGTGCCCrCATTATCGTCAGCCACGACCGTTATTTCITGGACAA^ 
ATTACGCTAGATTTGACCAAGCATTCCTTGGATCGCTATGTGGGGAATTACTCTCGTTTTGTC^ 
AAAAGCTAGTTACTGAGGCAAAAAACTATGAAAAGCAACAGAAGGAAATCGCTGCTCTGGAAGACnTTGTCAATC 
GCAATCTAGTTCGTGCTTCAACGACTAAACGTGCTCAATCTCGCCGTAAACAACTAGAAAAAATGGAGCGTTTC 
ACAAGCCTGAAGCTGGCAAGAAAGCAGCCAACATGACCTTCCAGTCTGAAAAAACGTCGGGCAATGTTGTT^ 

20 CTGTTGAAAATGCAGCTGTTGGCTATGACGGGGAAGTCTTGTCACAACCTATCAACCTAGATCTTCGTAAGA^^ 
TGCTGTCGCTATCGTTGGTCCAAATGGTATCGGCAAGTCAACCTTTATCAAGTCTATTGTGGACCAGATTCCTm 
ATCAAGGGAGAAAAGCGCTTTGGCGCTAATGTTGAGGTTGGTTACTATGACCAAACCCAAAGCAAGCTGACACCA 
AGTAAT ACGGT GCTGGATGAACTCTGGAATGATTTCAAACTGACACCAGAAGTTGAAATCCGCAACCGTCTTGGA 
GCCTTCCTTTTCTCAGGAGATGATGTTAAAAAATCAGTCGGCATGCTATCTGGTGGCGAAAAAGCTCGm 

25 TAGCTAAATTGTCTATGGAAAACAATAACTTTTTGATTCTGGATGAGCCGACCAACCACTTGGATA 
GGAAGTGCTAGAAAA TGCCT TGATTGACTTTGATGGAACCTTGCTGTTTGTCAGTCATGATCG^ 
CGTGTGGCAACTCATGTTTTGGAATTGTCTGAGAATGGTTCAACTCTCTACCTTGGAGATTACGACT^ 
AGAAGAAAGCAACAGCAGAAATGAGTCAGACTGAGGAAGCTTCAACTAGCAATCAAGCAAAGGAAGCAAGTCCA 
GTCAATGACTATCAGGCCCAGAAAGAAAGTCAAAAAGAAGTTCGCAAACTCATGCGACAAATCGAAAGTCTAGA 

AGCTGAAATTGAAGAGCTAGAAAGTCAAAGCCAAGCCATTTCTGAACAAATGTTGGAAACAAACGATGCCGACA
AACTCATGGAATTACAGGCTGAGCTGGACAAAATCAGCCATCGTCAGGAAGAAGCTATGCTTGAGTGGGAAGAAT 
TATCAGAGCAGGTGTAA 

MIILQANKIERSFAGEVLFDNINLQVDERDRIALVGKNGAGKSTLLKILVGEEEPTSGEINKKKDISLSYLAQDSRFESENT 
IYDEMLHVFNDLRRTERQLRQMELEMGEKSGEDLDKLMSDYDRLSENFRQAGGFTYEADIRAILNGFKFDESMWQMK
IAELSGGQNTRLALAKMLLEKPNLLVLDEPTNHLDIETIAWLENYLVNYSGALIIVSHDRYFLDKVATITLDLTKHSLDR
YVGNYSRFVELKEQKLVTEAKNYEKQQKEIAALEDFVNRNLVRASTTKRAQSRRKQLEKMERLDKPEAGKKAANMTF 
QSEKTSGNVVLTVENAAVGYDGEVLSQPINLDLRKMNAVAIVGPNGIGKSTFIKSIVDQIPFIKGEKRFGANVEVGYYDQ 
TQSKLTPSNTVLDELWNDFKLTPEVEIRNRLGAFLFSGDDVKKSVGMLSGGEKARLLLAKLSMENNNFLILDEPTNHL 
DIDSKEVLENALIDFDGTLLFVSHDRYFINRVATHVLELSENGSTLYLGDYDYYVEKKATAEMSQTEEASTSNQAKEAS
PVNDYQAQKESQKEVRKLMRQIESLEAEIEELESQSQAISEQMLETNDADKLMELQAELDKISHRQEEAMLEWEELSEQVZ



ID111 1179bp



ATGAATCGCTATGCAGTGCAGTTGATTAGCCGTGGGGCTATCAATAAAATGGGAAATATGCTCTATGATTATGGA 
AATAGTGTCTGGTTGGCTT CTAT GGGGACTATAGGACAGACAGTTTTAGGAATGTATCAGATTTCTGAGCTCGT^^ 
CATCTATTCTCGTCAATCCCTTTGGCGGAGTTATTTCAGACCGTTTTTCT^ 
CrrGTTTGTGGGATTCmCTCITOCTATTTCTT^ 
50 TAACATTGTGCAGGCTATTGCITTTGCCTTTTCTCGCACAGCCAATAAAGCTATC^^ 

GATG AGATT GTGATCTATAATTCTCGCTTAGAGCTGGTTTTGCAGGTTGTAGGTGTTAGCTCTCCTGTTCTTT 
CCTTWTTTACAGTTTGCAAGTCTCCATATGACGCTACTGCTAGACT^ 

TGGCTTTCCTTCCAAAAGAGGAAGCAAAAGTTCAAGAGAAAAAGGCTTTTACTGGGAGAGATAT^^ 
C TCAAGGA TGG GTTACA CTATATCTGGCATCAGCAAGAAATTTTCITCCTTTTG 
55 CTTTTTTGCAGCTTTTGAATTTCTACTTCCCr^ 
TAACTATGGGGGCTjATTGGTTCCA^ 

GATTTTACTGGCTTrGACAGGTGTCGGAGTTTTTATGATGGGATTACCACTTCCAAC ^ ^ 
ATTTAGTTTGTGAATTGTTTATGAC GATTTT TAATATTCACTrTm 
TTTCTTGGAAGAGTACTGAGTACAATTTTTACCTTAGCTATTCTATTTATGC 
OU CrTGCCAAGTGTCCATCTTTATTCTTTCTTGATTATTGGACTTGGAGTTGTAGCOT 
ATGTTCGAACTCATTTTGAAAAATTGATATAA 

MNRYAVQLISRGAINKMGNMLYDYGNSVWLASMGTIGQTVLGMYQISELVTSILVNPFGGVISDRFSRRKILMTADLV 
CGILCLAISFIRNDSWMIGALIVANIVQAIAFAFSRTANKAirrEVVEKDEIVIYNSRLELVLQVVGVSSPVLSFLVLQFASL 
HMTLLLDSLTFFIAFVLVAFLPKEEAKVQEKKAFTGRDIFVDIKDGLHYIWHQQEIFFLLLVASSVNFFFAAFEFLLPFSN
QLYGSEGAYASILTMGAIGSIIGALLASKIKANIYNLLILLALTGVGVFMMGLPLPTFLSFSGNLVCELFMTIFNIHFFTQV
QTKVESEFLGRVLSTIFTLAILFMPIAKGFMTVLPSVHLYSFLIIGLGVVALYFLALGYVRTHFEKLIZ 

ID113 2466bp

atgcaaaatcaattaaatgaattaaaacgaaaaatgctggaatttttccagcaaaaacaaaaaaata;^^ 

gctagacctggcaagaaaggttcaagtaccaaaaaatctaaaaccttagataagtcagccattttcccagctat^ 

ttvtctgagtataaaagcctt 

(nttgggatacggagtggccttatttgacaaggtrcgggtgcctcagacagaagaattggtgaatcaggtcaagg 

10 acatctcttctatttcagagattacctattcggacgggacggtgattgcttccatagagagtgam 

ttctatctcatctgagcaaatttcggaaaatctgaagaaggctatcattgcgacagaagatgaacacm 
acataagggtgtagtacccaaggcggtgattcgtgcgaccitggggaaatttgtaggtttgggttcctctagtg^ 
ggttcaaccttgacccagcaactaattaaacagcaggtggttggggatgcgccgaccttggctcgtaaggcggc^ 
g agat tgtggatgctcitgccitggaacgcgccatgaataaagatgagattttaacgacctatctcaa 

15 cctttggccgaaataataagggacagaatattgcaggggctcggcaagcagctgagggaattttcggtg^ 
ccagtcagttgactgttcctcaagcagcatttttagcaggacttccacagagtcccattact^ 
aaatactggggagttgaagagtgatgaagacctagaaattggcrraagacgggctaaggcagttctttacagtat 
gtatcgtacaggtgcattaagcaaagacgagtattctcagtacaaggattatgaccrraaacaggacttt^ 
atcgggcacggttacaggaatttcacgagactatttatactttacaactttggcagaagct^ 

20 gactatctagctcagagagacaatgtctccgctaaggagttgaaaaatgaggcaactcagaagttttatc^^ 

ttggcagccaaggaaattgaaaatggtggttataagattactactaccatagatcagaaaattcattctgccatg 
caaagtgcggttgctgattatggctatcttttagacgatggaacaggtcgtgtagaagtagggaatgtcttg^^^ 
gataaccaaacaggtgctattctaggcntrgtaggtggtcgtaattatcaagaaaatcaaaataatcatgcc^ 
ataccaaacgttcgccagcttctactaccaagcccttgctggcctacggtattgctattgaccagggcttgatc 

25 aagtgaaacgattctatcraactatccaacaaactttgctaatggcaatccgattatgtatgct^ 

aacaggaatgatgaccttgggagaagctctgaactattcatggaatatccctgcttactggacctatcgtat^ 

cgtgaaaagggtgttgatgtcaagggttatatggaaaagatgggttacgagattcctgagtacggtattgagagc 

ttgccaatgggtggtggtattgaagtcacagttgcccagcataccaatggctatcagaccttagctaataatgga 

GTTTATCATCAGAAGCATGTGATTTCAAAGATTGAAGCAGCAGATGGTAGAGTGGTGTATGAGTATCAGGATAAA 
30 CCGGTTCAAGTCTATTCAAAAGCTACTGCGACGATTATGCAGGGATTGCTACGAGAAGTTCTATCCTCTCGTGTGA 
CAACAACCTTCAAGTCrAACCTGACTTCTTTAAATCCTACTCTGGCTAATGCAGATTGGATTGGG 
AACCAACCAAGACGAAAATATGTGGCTCATGCTTTCGACACCTAGATTAACCCTAGGTGGCTGGATTGGGCATGA 
TGATAATCATTCATTGTCACGTAGAGCAGGTTATTCTAATAACTCTAATTACATGGCTCATCTGGTAAATGCGATT 
CAGCAAGCTTCCCCAAGCATTTGGGGGAACGAGCGCTTTGCTTTAGATCCTAGTGTAGTGAAATCGGAAGTCT^ 
35 AAATCAACAGGTCAAAAACCAGAGAAGGTTTCrGTTGAAGGAAAAGAAGTAGAGGTCACAGGTTCGACTGTTACC 
AGCTATTGGGCTAATAAGTCAGGAGCGCCAGCGACAAGTTATCGCnTrGCTATTGGCGGAAGTGATGCGGATTAT 
CAGAATGCTTGGTCTAGTATTGTGGGGAGTCTACCAACTCCATCCAGCTCCAGCAGTTCAAGTAGTAGTTCTAGCG 
ATAGCAGTAACTCAAGTACTACACG ACC 1 1 C 1 1 CTl CAAGGGCG AGACGATAA 

MQNQLNELKRKMLEFFQQKQKNKKSARPGKKGSSTKKSKTLDKSAIFPAILLSIKALFNLLFVLGFLGGMLGAGIALGY
GVALFDKVRVPQTEELVNQVKDISSISEITYSDGTVIASIESDLLRTSISSEQISENLKKAIIATEDEHFKEHKGVVPKAVIR
ATLGKFVGLGSSSGGSTLTQQLIKQQVVGDAPTLARKAAEIVDALALERAMNKDEILTTYLNVAPFGRNNKGQNIAGA 
RQAAEGIFGVDASQLTVPQAAFLAGLPQSPITYSPYENTGELKSDEDLEIGLRRAKAVLYSMYRTGALSKDEYSQYKDY 
DLKQDFLPSGTVTGISRDYLYFTTLAEAQERMYDYLAQRDm^SAKELKNEATQKFYRDLAAmENGGYKITTTIDQKI 

HSAMQSAVADYGYLLDDGTGRVEVGNVLMDNQTGAILGFVGGRNYQENQNNHAFDTKRSPASTTKPLLAYGIAIDQG
LMGSETILSNYPTNFANGNPIMYANSKGTGMMTLGEALNYSWNIPAYWTYRMLREKGVDVKGYMEKMGYEIPEYGIE 
SLPMGGGIEVTVAQHTNGYQTLANNGVYHQKHVISKIEAADGRVVYEYQDKPVQVYSKATATIMQGLLREVLSSRVTT 
TFKSNLTSLNPTLANADWIGKTGTTNQDENMWLMLSTPRLTLGGWIGHDDNHSLSRRAGYSNNSNYMAHLVNAIQQA 
SPSIWGNERFALDPSVVKSEVLKSTGQKPEKVSVEGKEVEVTGSTVTSYWANKSGAPATSYRFAIGGSDADYQNAWSSI 

VGSLPTPSSSSSSSSSSSDSSNSSTTRPSSSRARRZ

ATGAAAAAATTTTATGTAAGTCCAATTTTTCCTATTCTAGTAGGATO 

55 TATTTTTGTTAATAATAATCrGTTGACGGTTTTAATTTTG II'ILU^I-IM GTAGGAGGCTATGTTTTTTTATTTAAGAA 

ACTGAGAGTGCATTATACAAGGAGTGATGTAGAACAGATACAGTATGTAAACCACCAAGCGGAAGAAAGTTTGAC 
AGCTCTATTGGAACAGATGCCTGTAGGTGTTATGAAATTGAATTTATCT^ 
TATGCTGAATTGATTTTGACCAAGGAAGATGGTGATTTTGATTTAGAAGCrGTTCAAACGA™ 
TAGGAAATCCGTCTACTTATGCCAAGCTTGGTGAGAAGCGTTATGCTGT^ 

60 GTATTTTGTAGATGTATCCAGGGAACAAGCCATAACAGATGAATTGGTAACAAGTAGACCAGTGATTGGGATTGT 
CT CTGT GGATAATTA TGAT GATTTGGAGGATGAAACTTCTGAGTCAGATATTAGTCAAATCAATAGTTTTGTAGCT 
AATTTTATATCAGAGTTTTCAGAAAAACACATGATGTTTTCTCGTCGGGTAAGTATGGATCGATTTTA^^ 
TGACTACACGGTGCTTGAGGGCTTGATGAATGATAAATTTTCTGTTATTGATGCm 

AGACAGTTGCCCTTGACCnTAAGTATGGGATTTTCTTATGGCGATGGAAATCATGATGAGATAGGGAAAG^^ 
55 TGCTCAATTTGAACITGGCTGAAGTACGTGGTGGCGACCAGGTGGTTGTTAAGGAAAACGACGAAACGAAAAATC 
CAGTTTATTTTGGTGGTGGGTCTGCTGCTTCAATCAAGCGTACACGGACTCGTACGCGCGCTATGATGAC^^
TTCAGATAAGATTCGGAGTGTAGATCAGGTTirrGTAGTCGGTCACAAAAATTTAGACATGGATGCTT^ 
GCTGTAGGTATGCAGTTGTTCGCCAGCAATGTGATTGAAAATAGCTATGCTCTTTATGATGAAGAACAAATGTCT^ 
CAGATATTGAACGAGCTG TTTC ATTCATAGAAAAAGAAGGAGTTACGAAGTTGTTGTCTGTTAAGGATGCAATGG 
5 GGATGGTGACCAATCGTTCnTrGTTGATTCTTGTAGACCATTCAAAGACAGCCTTAACATTATCAA^^ 

TGATTTATTTACCCAAACCATTGTTATTGACCACCATAGAAGGGATCAGGATTTTCCAGATAATGCGGTTAT^^ 
TATATCGAAAGTGGTGCAAGTAGTGCCAGTGAGTTGGTAACGGAATTGATTCAGTTCCAGAATTCTAAGAAAAAT 
CGTTTGAGTCGTATGCAAGCAAGTGTCTTGATGGCTGGTATGATGTTGGATACTAAAAATTTCACCrCGCGAGT^ 
CTAGTC GGAC ATrrGATGTTGCTAGCTATCTCAGAACGCGCGGAAGTGATAGTATTGCTATCCAGGAAATCGCTG 

10 GACAGATTTTGAAGAATATCGTGAGGTCAATGAACTTATTTTACAGGGGCGTAAATTAGGTTCAGATGT^ 

GCAGAGGCTAAGG ACATG AAATGCTATGATACAGTTGTTATTAGTAAGGCAGCAGATGCCATGTTAGCCATGTCA 
GGTATTGAAGCGAGTTTTGTTCTTGCGAAGAATACACAAGGATTTATCTCTATCTCAGCTCGAAG 
TGAATGTACAACGGATTATGGAAGAGTTAGGCGGTGGAGGCCACTITAATTTGGCAGCAGCTCAAATTAAAGATG 
TAACCTTGTCAGAAGCAGGTGAAAAACTGACAGAAATTGTATTAAATGAAATGAAGGAAAAGGAGAAAGAAGAA 

TGA

MKKFYVSPIFPILVGLIAFGVLSTFIIFVNNNU-TVULFLFVGGYVFLFKKLRVHYTRSDVEQIQYVNHQAEESLTALLE 

QMPVGVMKLNLSSGEVEWFNPYAELILTKEDGDFDLEAVQTIIKASVGNPSTYAKLGEKRYAVHMDASSGVLYFVDVS 

REQAITDEL\TSRPVIGIVSVDNYDDLEDETSESDISQINSFVANRSEFSEKHMMFSRRVSMDRFYLFTDYTVLEGLMN 

DKFSVIDAFREESKQRQLPLTLSMGFSYGDGNHDEIGKVALLNLNLAEVRGGDQVVVKENDETKNPVYFGGGSAASIK
RTRTRTRAMMTAISDKIRSVDQVFVVGHKNLDMDALGSAVGMQLFASNVIENSYALYDEEQMSPDIERAVSFIEKEGV 
TKLl^VKDAMGMVTNRSLULVDHSKTALTLSKEFYDLFTQTIVIDHHRRDQDFPDNAVITYIESGASSASELVTEUQFQ 
NSKKNRLSRMQASVLMAGMMLDTKNFTSRVTSRTFDVASYLRTRGSDSIAIQEIAATDFEEYREVNELILQGRKLGSDV 
LIAEAKDMKCYDTVVISKAADAMLAMSGIEASFVLAKNTQGHSISARSRSKLNVQRIMEELGGGGHFNLAAAQIKDVT 

LSEAGEKLTEIVLNEMKEKEKEEZ

ID115 663bp

atgaagt gcttg ttatgtgggcagactatgaagactgttttaacnttragtagtctct 
30 actcttgtctttgttcagactgtgattctacttttgaaagaattggggaagagaact^ 

agagttgtcaacaaagtgtcaag attgtca actttggtgtaaagagggagttgaagtcagtcatagagcgatttt 
tactta caatc aagctatgaaggattititcagtcggtat'aagtttgatggagacttcctgt^^ 
gcttca i i r i raagtg agg agttgaaaaagtacaaag agtatcaatrrgttgtaattcccctaagtcctgatagat 
atgctaatagaggatttaatcaggttgagggcttggtagaggcagcaggctttgagtatctggatttattagaga 
35 aaagagaagagagagccagttcttctaaaaatcgttcagagcgcntggggacagaacttccttrcm 
giggagtcactattcctaaaaaaatcctacttatagatgatatctatactac^ggagcaactataaatc^ 

GAAACTGTTGGAAGAAGCTGGTGCTAAGGATGTAAAAACATTTTCCCTTGTAAGATGA 

MKCLLCGQTMKTVLTFSSLLIXRNDDSCLCSDCDSTFERIGEENCPNCMKTELSTKCQDCQLWCKEGVEVSHRAIFTY 
NQAMKDFFSRYKFDGDFLLRKVFASFLSEELKKYKEYQFVVIPLSPDRYANRGFNQVEGLVEAAGFEYLDLLEKREER
ASSSKNRSERLGTELPFFIKSGVTIPKKILLIDDIYTTGATINRVKKLLEEAGAKDVKTFSLVRZ 

ID116 1299bp

45 ATGAAAGTAAATTTAGATTATCTCGGTCGTTTATTTACTGAGAATGAATTAACAGAAGAAGAACGTCAG^^ 
GAGAAACTTCCAGCAATGAGAAAGGAGAAGGGGAAACTTTTCTGTCAACGCTGTAATAGTACTA^ 
TGGT ATrT GCCCATCGGTG CTTA CTATTGTCGAGAGTGmGCTGATGAAGCGAGTCAGAAGTGATCAAACm 
ACTATTTTCCGCAGGAGGATTTTCCAAAGCAAGATGTTCrCAAATGGCGCGGCCAATTAACTCCT^ 
GGTGTCAGAGGGATTGCTTCAAGTAGTAGACAAGCAAAAGCCAACCTTAGTTCATGCGGTAACAGGAGCTGGAAA 

GACAGAAATGATTTATCAAGTAGTGGCTAAAGTGATCAATGCGGGTGGTGCAGTGTGTTTGGCTAGTCCTCGCAT
AGATGTTTGTTTGGAGCTGTACAAGCGCCTGCAACAGGATTTTTOT 
GAACCTTATTTTCGAACACC ACTAGT TGTTGCAACAACCCATCAGTTATTGAAGTTT^ 
GATAGTGGATGAAGTAGATCCTTTTCCrrATG^ 

GAGAATGGATTGAGAATCrTTTTAACAGCGACTTCGACCAATGAGTTAGATAAAAAGGTCCGm 
55 AAAAGACTGAATTTACCGAGACGGTTTCATGGAAATCCGTTGATTATTCCAAAACCAATTTGGTTATCGGAT^ 
ATCGCTACn TAGAC AAGAATCGTTTGTCACCAAAGTTAAAGTCCTATATTGAGAAGCAGAGAAAGACAGCTTATC 
CGTTACTCAri i riG CTTCAGAAATTAAGAAAGGGGAGCAGTTAGCAGAAATCTTACAGGAGCAATTTCCAAATC 
AGAAAATTGG(mTGTATCTTCTGTAACAGAGGATCGATTAGAGCAAGTACAAGCTTTTCGAGATC^ 
CAATACTTATCAGTACGACAATCTTGGAGCGCGGAGTTACCTTCCCTTGTGTGGATGTTTTCGTAGTAGAGGCC^ 
60 TCATCG TTTGT TTACCAAGTCTAGTTTGATTCAGATTGGTGGACGAGTTGGACGAAGCATGGATAGACCGACAGGA 
GATTTGCi ri iUi l CCATGATGGGTTAAATGCrrCAATCAAGAAGGCGATTAAGGAAATTCAGATGATGAATAAG 
AGGCTGGTCTATGA 

MKVNLDYLGRLFTENELTEEERQLAEKLPAMRKEKGKLFCQRCNSTILEEWYLPIGAYYCRECLLMKRVRSDQTLYYF 
PQEDFPKQDVLKWRGQLTPFQEKVSEGLLQVVDKQKPTLVHAVTGAGKTEMIYQVVAKVINAGGAVCLASPRIDVCL
ELYKRLQQDFSCGIALLHGESEPYFRTPLVVATTHQLLKFYQAFDLLIVDEVDAFPYVDNPMLYHAVKNSVKENGLRIF
LTATSTNELDKKVRLGELKRLNLPRRFHGNPLIIPKPIWLSDFNRYLDKNRLSPKLKSYIEKQRKTAYPLUFASEIKKGE 
QLAEILQEQFPNEKIGFVSSVTEDRLEQVQAFRDGELTILISTTILERGVTFPCVDVFVVEANHRLFTKSSLIQIGGRVGRS 
MDRPTGDLLFFHDGLNASIKKAIKEIQMMNKEAGLZ

ID117 870bp

ATGCAAATTCAAAAAAGTTTTAAGGGGCAGTCTCCCTATGGC^ 
CTAGATGAT^VrGACTTTTCGTGCTAT 
10 ACAGGGCri i rGCTCAAGCATTTTGACATTTCCACCAAGCAGATCAGTTTTCATGAGCACAATGCCAAGGAAA^ 

ATTCCTGATTTGATTGGTTTCrTGAAAGCAGGGCAAAGTATTGCTCAGGTCTCTGATGCCGGT^ 
CAGACCCTGGTCATGATTTAGTTAAGGCAGCTATTGAGGAAGAAATrGCAGTTGTGACAGTTCCAGGTGCCTCTGC 
AGGAATTTCTGCCTTGjVITGCCAGTGGTTTAGCGCCA 

GGTCAGCAGAAGCAATTTTTTGGCTTGAAAAAAGATTATCCTGAAACACAGATITTTTATG^^ 
TAGCAGACACGTTGGAAAATATGTTAGAAGTCTACGGTGACCGCTCCGTTGTCTTGGTCAGGGAATTGACCAAAA
TCTATGAAGAATACCAACGAGGTACTATCTCTGAGTTATTAGAAAGCATTGCTGAAACGCCACTCAAGGGCGAAT 
GTCTTCTCATTGTTGAGGGTGCCAGTCAGGGTGTGGAGGAAAAGGACGAGGAAGACTTGTTCGTAGAAATTCAAA 
CCCGCATCCAGCAAGGTGTGAAGAAAAACCAAGCTATCAAGGAAGTCGCTAAGATTTACCAGTGGAATAAAAGTC 
AGCTCTACGCTGCCTACCACGACTGGGAAGAAAAACAATAA 



MQIQKSFKGQSPYGKLYLVATPIGNLDDMTFRAIQTLKEVDWIAAEDTRNTGLLLKHFDISTKQISFHEHNAKEKIPDLI 
GFLKAGQSIAQVSDAGLPSISDPGHDLVKAAIEEEIAVVTVPGASAGISALIASGLAPQPHIFYGFLPRKSGQQKQFFGLKK 
DYPETQIFYESPHRVADTLENMLEVYGDRSVVLVRELTKIYEEYQRGTISELLESIAETPLKGECLLIVEGASQGVEEKDE 
EDLFVEIQTRIQQGVKKNQAIKEVAKIYQWNKSQLYAAYHDWEEKQZ

ID118 345bp



ATGATAAAGAAAGGAAAGGGCTGTTTTATGGACAAAAAAGAATTATTTGACGCGCTGGATGATTm 
TTATTGGTAACCTTAGCCGATGTGGAAGCCATCAAGAAAAATCTCAAGAGCCTGGTAGAGGAAAATACAGCTOT 

cgcttggaaaatagtaagttgcgagaacgcttgggtgaggtggaagcagatgctcctgtcaaggccaagcatgtt
cgcgaaagtgtccgtcgtatttaccgtgatggatttcacgtatgtaatgatttttatggacaacgtcgagagcagg 
acgaagaatgtatgttttgtgacgagttgttatacagggagtaa 

mikkgkgcfmdkkelfdalddfsqqllvtladveaikknlkslveentalrlensklrerlgeveadapvkakhvres 
vrriyrdgfhvcndfygqrreqdeecmfcdellyrez

ID119 639bp 

ATGT CAAA AGGATTTTTAGTCTCTCnTGAGGGACCAGAGGGAGCAGGCAAGACCAGTGTTTTAGAGGCT 
40 CCAATTTTAG AGGA AAAAGGAGTAGAGGTGTTGACGACCCGTGAACCTGGCGGAGTCTrGATTGGGGAGAAGATT 
CGGGAAGTGATTTTGGATCCAAGTCATACTCAGATGGATGCTAAAACAGAGCTACITCTCrATAT^^ 
GAGAGCATTTGGTGGAAAAAGTTCTTCCAGCCCriTGAAGCTGGCAAGTTGGTCATCATGGATCGTTT^ 
TTCTGTTGCCTATCAGGGATTTGGTCGTGGCTTAGATATTGAAGCCATTGACTGGCTCAATCAGTTTGCGACAGAT 
GGCCTCAAACCCGATTTGACACTCTATTTTGACATCGAGGTGGAAGAAGGGCTGGCTCGTATTGCTGCTAATAGTG 
43 ACCGCGAGGTTAATCGTTTGGATTTGGAAGGGTTGGACTTGCATAAAAAAGTTCGTCAAGGCTACCTT^ 

GGATAAAGAGGGAAATCGCATTGTCAAGATTGATGCTAGTCTCCCTTTGGAGCAAGTTGTGGAAACTACCAAGGC 
TGTC7TGTTTGACGGAATGGGCTTGGCCAAATGA 

MSKGFLVSLEGPEGAGKTSVLEALLPILEEKGVEVLTTREPGGVLIGEKIREVILDPSHTQMDAKTELLLYIASRRQHLVE
KVLPALEAGKLVIMDRFIDSSVAYQGFGRGLDIEAIDWLNQFATDGLKPDLTLYFDIEVEEGLARIAANSDREVNRLDL
EGLDLHKKVRQGYLSLLDKEGNRIVKIDASLPLEQVVETTKAVLFDGMGLAKZ 

ID12Q Wbp 

55 ATGGTAGAACAAAGAAAATCAATTACCATGAAAGATGTTGCTTTAGAAGCAGGAGTTAGTGTTGGAACTGTTTCA 
CGTGTAATTAATAAAGAAAAAGGCATTAAAGAAGTAACTTTGAAAAAAGTGGAACAAGCGATTAAAACrn^GA^^ 
TACATTCCAG ATTACT ACGCTAGAGGAATGAAAAAAAATCGAACAGAAACGATTGCAATCATTGTACCAAGTATC 
TGGCATCCCTTCTTTTCAGAATTTGCTATGCATGTGGAAAATGAAGTCTATAAGAGAAATAACAAArrAC^ 
GTTCTATCAATGGTACAAATAGAGAGCAAGACTATCTGGAGATGTTGCGTCATAATAAAGTTGATGGAGTGGTTG 

CCATTACCTATAGGCCAATTGAACATTACTTGACGTCAGGAATTCCCTTTGTTAGTATTGACCGCACATACTCAGA
GATTGCCATTCCTTGTGTTTCA 



MVEQRKSITMKDVALEAGVSVGTVSRVINKEKGIKEVTLKKVEQAIKTLNYIPDYYARGMKKNRTETIAIIVPSIWHPFF 
SEFAMHVENEVYKRNNKLLLCSINGTNREQDYLEMLRHNKVDGVVAITYRPIEHYLTSGIPFVSIDRTYSEIAIPCVS 
ATGAATATATTTAGAACAAAGAATGTTAGrrrAGATAAAACAGAGATGCATAGGCAm
ATTTTGCTGGGTATCGGAGCCATGGTAGGGACAGGCGTCITTACAATCACAGGTACTGCAGCTGCAACACT^ 
D GCCCAGCCCTAGTGATTTCAATCGTTATTTCTGCCTTGTGTGTGGGATTATCAGCCCTCT^^ 

TCGCGAGTACCCGCTACAGGAGGTGCCTATAGTTACCTCTATGCTATCTTAGGAGAATTCCCTGCCTGGTTGGCTG 
GTTGGTTAACCATGATGGAGTTCATGACAGCCATATCAGGCGTAGCTTCGGGTTGGGCAGCTTATTTTAA 

MNIFRTKOTSLDKTEMHRHLKLWDULLGIGAMVGTGVFTITGTAAATLAGPALVISIVISALCVGl^ALFFA 
lU ATGGAYSYLYAILGEFPAWLAGWLTMMEFMTAISGVASGWAAYF 

ID124 13Ubp 

ATGAAATCAAGAGTAAAGGAAACGAGTATGGATAAAATTGTGGTTCAAGGTGGCGATAATCGTCTGGTAGGAAGC 

10 Cn-GACGATCGAGGGAGCAAAAAATGCAGTCTTACCCTTGTTGGCAGCGACTATTCTAGCAAGTGAAGGAAAGACC 
GTCTTGCAGAATGTTCCGATITTGTCGGATGTCTTTATTATGAATCAGGTAGrrGGTGGm 
ACTITGATGAGGAAGarCATCTTGTCAAGGTGGATGCTACTGGCGACATCACTGAGGAAGCCCCTTACAAGTATG 
TCAGCAAGATGCGCGCCTCCATCGTTGTATTAGGGCCAATCCTTGCCCGTGTGGGTCATGCCAAGGTATCCATGCC 
AGGTGGTTGTACGATTGGTAGCCGTCCTATTGATCTTCATTTGAAAGGTCTGGAAGCTATGGGGGTTAAGATTAGT 

ZU CAGACAGCTGGTTACATCGAAGCCAAGGCAGAACGCTTGCATGGTGCTCATATCTATATGGACTTTCCAAGTGTO 
GTGCAACGCAGAACTTGATGATGGCAGCGACTCTGGCTGATGGGGTGACAGTGATTGAGAATGCTGCGCGTGAGC 
CTGAGATTGTTGACrrAGCCATTCTCOTAATGAAATGGGAGCCAAGGTCAAAGGTGCTGGTACAGAGACTATAA 
CCATTACTGGTGTTGAGAAACTTCATGGTACGACTCACAATGTAGTCCAAGACCGTATCGAAGCAGGAACC^ 
GGTAGCTGCTGCCATGACTGGTGGTGATGTCTTGATTCGAGACGCrGTCTGGGAGCACAACCGTCCCnTGATTG^ 

Id AAGTTACTTGAAATGGGTGTTGAAGTAATTGAAGAAGACGAAGGAATTCGTGTTCGTTCrCAACT^^ 

AAAGCTGTTCATGTGAAAACCrrGCCCCACCCAGGATTTCCAACAGATATGCAGGCTCAATTTACAGCOT 
CAGTTGCAAAAGGCGAATCAACCATGGTGGAGACAGTTTTCGAAAATCGTTTCCAAACCTAGAAGAGATC 
CATGGGCTTGCATTCTGAGATTATCCGTGATACAGCTCGTATTGTTGGTGGACAGCCTTTGCAGGGAGCAGAAGT^ 
CTTTCAACTGACCTTCGTGCCAGTGCGGCCrrGATTTTGACAGGTTTGGTAGCACAGGGAGA 

JU AATTGGTTCACTTGGATAGAGGTTACTACGGTTTCCATGAGAAGTTGGCGCAGCTAGGTGCTAAGATT^ 
TGAGGCAAGTGATGAAGATGAATAA 

MKSRVKETSMDKIVVQGGDNRLVGSVTIEGAKNAVLPLLAATILASEGKTVLQNVPILSDVFIMNQVVGGLNAKVDFD 
EEAHLVKVDATGDrTEEAPYKYVSKMRASIVVLGPILARVGHAKVSMPGGCTIGSRPIDLHLKGLEAMGVKISQTAGYIE 
J5 AKAERI^GAHIYMDFPSVGATQNLMMAATLADGVTVIENAAREPEIVDLAILLNEMGAKVKGAGTETITITGVEKLHG 
TTHWVQDRIEAGTFMVAAAMTGGDVURDAVWEHNRPUAKLLEMGVEVIEEDEGIRVRSQLENLKAVHVKTLPHP 
GFPTDMQAQFTALMTVAKGESTMVETVFENRFQHLEEMRRMGLHSEIIRDTARIVGGQPLQGAEVLSTDLRASAAUL 
TGLVAQGETVVGKLVHLDRGYYGFHEKLAQLGAKIQRIEASDEDEZ 

40 ID125 UOlbp 

ATGTTATTAGCGTCAACAGTAGCCTTGTCATTTGCCCCAGTATTGGCAACTCAAGCAGAAGAAGTTC^ 

CAC GTAG TGTTGAGCAAATCCAAAACGATTTGACTAAAACGGACAACAAAACAAGTTATACCGTACAGTATGGTG 

ATACTTTG AGCA CCATTGCAGAAGCCTTGGGTGTAGATGTCACAGTGCTTGCGAATCTGAACAAAATCACTAATAT 

40 GGACTTGArnrCCCAGAAACrGTTTTGACAACGACTGTCAATGAAGCAGAAGAAGTAACAGAAGTTCAAATCCA 
AACACCTCAAGCAGACTCTAGTGAAGAAGTGACAACTGCGACAGCAGATTTGACCACTAATCAAGTGACCGTTGA 
TGATCAAACrGTTCAGGTTGCAGACCriTCTCAACCAATTGCAGAAGTTACAAAGACAGTGATTGCTTCT^ 
GTGGCACCATCTACGGGCACTTCTGTCCCAGAGGAGCAAACGACCGAAACAACTCGCCCAGTTGCAGAAGAAGCT 
CCTCAGGAAACGACTCCAGCTGAGAAGCAGGAAACACAAACAAGCCCTCAAGCTGCATCAGCAGTGGAAGCAAC 

3U TACAACAAGTTCAGAAGCAAAAGAAGTAGCATCATCAAATGGAGCTACAGCAGCAGTTTCTACTTATCAACCAGA 
AGAAACGAAAGTAATTTCAACAACnTACGAGGCTCCAGCTGCGCCCGATTATGCrGGACTTGCAGT^^ 
TGAAAATGCAGGTCTTCAACCACAAACAGCTGCCrrrAAWGAAGAAATTGCTAACTTGTTTGGCATTA 
AGTGGTTATCGTCCAGGAGACAGTGGAGATCACGGAAAAGGTTTGGCTATCGACTTTATGGTACCAGAACGTTCA 
GAATTAGGGGATAAGATTGCGGAATATGCTATTCAAAATATGGCCAGCCGTGGCATTAGTTACATCATCTGGAAA 

DD CAACGTTTCTATGCTCCATTCGATAGCAAATATGGGCCAGCTAACACTTGGAACCCAATGCCAGACCGTGGTAGT 
GTGACAGAAAATCACTATGATCACGTTCACGTTTCAATGAATGGATAA 

MLl^STVALSFAPVLATQAEEVL\VTARSVEQIQNDLTKTDNKTSYTVQYGDTl^IAEAlX3VD\n^ 
IFPETVLTTTVNEAEE\rrEVEIQTPQADSSEEVTTATADLTTNQVTVDDQTVQVADI^QPIAEVTKTVI^ 
OU VPEEQTTETTRPVAEEAPQETTPAEKQETQTSPQAASAVEATITSSEAKEVASSNGATAAVSTYQPEETKVISTTYEAPA 
APDYAGLAVAKSENAGLQPQTAAFKKKLLTCLALHPLVVIVQETVErrEKVWLSTLWYQWQNZGIRLRNMLFKIWPA 
VALVTSSGNNVSMLHSIANMGQLTLGTQCQTVVVZQKITMITFTFQZMD 

ID126 1281bp

5 TTGTrTAAGAAAAATAAAGACATTCTTAATATTGCATTGCCAGCTATGGGTGAAAACTT^ 
GAATGGTGGACAGTTATTTGGTTGCTCATTTAGGATTGATAGCTATrrCAGGGGT^ 
CACCATTTATCAGGCGATTTTCATCGCTCTGGGAGCTGCTATTTCCAGTGTTATTTCAA^ 
GACCAGTCGAAGTTGGCCTATCATGTGACTGAGGCGTTGAAGATTACCTTACTATTAAGTTTCCT^ 
TGTCCATCnrCGCTGGGAAAGAGATGATAGGACTTTTGGGGACGGAGAGGGATGTAGCTGAGAGTGGTGGACTGT 

10 ATCTATCmGGTAGGCGGATCGATTGTTCTCITAGGTTTAATGACrAGTCTAGGAGCOT 
TAATCCACGTCTGCCTCTCTATGTTAGTTTTTTATCCAATGCCTTGAATATTC^^ 
TCTGGATATGGGGATAGCTGGTGTTGCTTGGGGGACAATTGTGTCTCGTTTGGTTGGTCTTGTGAT^ 
AATTAAAACTGCCTTATGGGAAGCCAACTTTTGGTTTAGATAAGGAACTGTTGACCTTGGCm 
AGAGCGACTTATGATGAGGGCrGGAGATGTAGTGATCATTGCCITGGTCGTTT<nTI^ 

15 GGGAATGCAATCGGAGAAGTCTTGACCCAGTTTAACTATATGCCTGCCTTTGGCGTCGCTACGGCAACGGTCAT^ 
CTGTTGGCCCGAGCAGTTGGA GAGG ATGATTGGAAAAGAGTTGCTAGTTTGAGTAAACAAACCrm 
TGTTCCTCATGTTGCCCCTGTCCmAGTATATATGTCTTGGGTGTACCATTAACTCATCTCTATACGACTGATT^ 
CTAGCGGTGGAGGCTAGTGTTCTAGTGACACTGTTTTCACTACTTGGGACCCCTATGACGACAGGAACAGTCAT 
ATACGGCAGTCTGGCAGGGATTAGGAAATGCACGCCrCCCTTTTTATGCGACAAGTATAGGAATGTGGTGTATCC 

20 GCATTG GGAC AGGATATCTGATGGGGATTGTGCTTGGTTGGGGCTTGCCTGGTATTTGGGCAGGGTCTCTC^ 
TAATGGTTTTCGCTGGTTATTTCTACGCTATCGTTACCAGCGCTATATGAGCTTGAAAGGATAG 

LFKKNKDILNIALPAMGENFLQMLMGMVDSYLVAHLGLIAISGVSVAGNirriYQAIFULGAAlSSVISKSIGQKDQSKLA 
YHVTEALKnXLLSFLLGFLSIFAGKEMIGLLGTERDVAESGGLYI^LVGGSWLLGLMTSLGAURATHNPRLPLYVSFL 
SNALNILFSSLAIFVLDMGIAGVAWGTIVSRLVGLVILWSQLKLPYGKPTFGLDKELLTLALPAAGERLMMRAGDVVIIA
LWSFGTEAVAGNAIGEVLTQFNYMPAFGVATATVMLLARAVGEDDWKRVASLSKQTFWLSLFLMLPLSFSIYVLGVP 
LTHLYTTDSLAVEASVLVTLFSLLGTPMTTGTVIYTAVWQGLGNARU'FYATSIGMWCIIUGTGYLMGIVLGWGLPGIW 
AGSLLDNGFRWLFLRYRYQRYMSLKGZ 

ID127 894bp

GTGGGA AGAAT TATCAGAGCAGGTGTAAAGATGGAACATCTTGGAAAAGTATrrCGTGAATTTCGAACAAGTGGA 
AATTATTCITTAAAGGAAGCAGCAGGCGAATCCT^ 

ACC TGGC AGTCTCCCGTTTCTTTGAGATTTTGGATAACATTCATGTAACAATCGAAAATT^ 
35 GAATITrCATAATCATGAACATGTGTCTATGATGGCACAGATTATCCCACTTTACTATTC/^ 
TTTC AAAAGCTTCAAAGAGAACAACTTGAAAAGTCTAAGAGTrCGACGACTCCCCTTTATT^ 
TTTTGCTACAAGGTCTGATTTGTCAAAGAGATGCGAGTTATGATATGAAGCAGGATGATrrGGGTAAGGTAG^^^ 
ATTATCTCrrCAAAACAGAAGAATGGACCATGTATGAGTTGATTCrmCGGTAACCTCTATAGm 
AGACTATGTC ACTCG GATTGGTAGAGAAGTTATGGAGAGGGAGGAATTTTACCAAGAGATTAGTCGCCATAAGAG 
40 ATTAGTGTTGATTrrGGCCCTCAATTGTTACCAGCATTGTTTAGAGCATTCTTCnTm 

AGGCTTATACAGAGAAGATTATTGACAAAGGTATTAAGCTTTATGAGCGTAATGTTTTCCATTAT^ 

TGCCTTATATCAAAAAGGACAGTGTAAAGAAGGCTGTAAGCAGATGCAAGAGGCCATGCATATTTTTGATGTGTT 

AGGTCTrCCAGAGCAAGTAGCCTATTATCAGGAACACTACGAAAAATTTGTCAAAAGTTAA 

VGRIIRAGVKMEHLGKVFREFRTSGNYSLKEAAGESCSTSQLSRFELGESDLAVSRFFEILDNIHVTIENFMDKARNFHN
HEHVSMMAQIIPLYYSNDIAGFQKLQREQLEKSKSSTTPLYFELNWILLQGLICQRDASYDMKQDDLGKVADYLFKTEE 
WTMYELILFGNLYSFYDVDYVTRIGREVMEREEFYQEISRHKRLVLILALNCYQHCLEHSSFYNANYFEAYTEKIIDKGI 
KLYERNVFHYLKGFALYQKGQCKEGCKQMQEAMHIFDVLGLPEQVAYYQEHYEKFVKSZ 
TABLE 3
ID1 1068bp

5 ATGTCTAAC ATTCA AAACATGTCCCTGGAGGACATCATGGGAGAGCGCTTTGGTCGCTACTCCAAGTACATTAT^^ 
AAGACCGGGCTTTGCCAGATATTCGTGATGGGTTGAAGCCGGTTCAGCGCCGTATTCTTTATTCTATGAATAAGGA 
TAGCAATACTTTTGACAAGAGCTACCGTAAGTCGGCCAAGTCAGTCGGGAACATCATGGGGAATTTCCACCCACA 
CGGGGATTCTTCTATCTATGATGCCATGGTTCGTATGTCACAGAACTGGAAAAATCGTGAGATTCTAGTTGAAATG 
CACGGTAATAACGGTTCTATGGACGGAGATCCTCCTGCGGCTATGCGTTATACTGAGGCACGTTTGTCTGAAATTG 

10 CAGGCTACCTTCTTCAGGATATCGAGAAAAAGACAGTTCCTTTTGCATGGAACTTTGACGAT^ 

CAACGGTCTTGCCAGCAGCCTTTCCAAACCTCITGGTCAATGGTTCGACTGGGATTTCGGCTGGT^^ 
CATTCCTCCCCATAATTTAGCTGAGGTCATAGATGCTGCAGTTTACATGATTGACCACCCAACTGCAAAGATTGAT 
AAACTCATGGAATTCITGCCTGGACCAGACTrCCCTACAGGGGCTATTATTCAGGGTCGTGATGAAATCAAG^ 
GCTTATGAGACTGGGAAAGGGCGCGTGGTTGTTCGTTCCAAGACTGAAATTGAAAAGCTAAAAGGTGGTAAGGAA 

15 CAAATCGTTATTATTGAGATTCCTTATGAAATCAATAAGGCCAATCTAGTCAAGAAAATCGATGATGTTCGTGTTA 
ATAACAAGGTAGCTGGGATTGCTGAGGTTCGTGATGAGTCTGACCGTGATGGTCTTCGTATCGCTATCGAACTTAA 
GAAAGACGCTAATACTGAGCTTGTTCTCAACTACTTATTTAAGTACACCGACCTACAAATCAACT^ 
ATGGTGGCGATTGACAArrrCACACCrCGTCAGGTTGGATTGTTCCAATCCTGTCTAGCTATATCGCTCACCGTCG 
AGAAGTGA 

MSNIQNMSLEDIMGERFGRYSKYIIQDRALPDIRDGLKPVQRRILYSMNKDSNTFDKSYRKSAKSVGNIMGNFHPHGDS 
SIYDAMVRMSQ^nVKNmLVEMHGNNGSMIXJDPPAAMRYTEARl^EIAGYLLQDIEKKTVPFAWNFDDTEKEPT^ 
AAFPNLLVNGSTGISAGYATDIPPHNLAEVIDAAVYMIDHPTAKIDKLMEFLPGPDFPTGAIIQGRDEDCKAYETGKGRV 
VVRSKTEIEKLKGGKEQIVIIEIPYEINKANLVKKIDDVRVNNKVAGIAEVRDESDRDGLRIAIELKKDANTELVLNYLFK 
YTDLQINYNFNMVAIDNFTPRQVGLFQSCLAISLTVEKZ

ID12 684bp

ATGCCGACATTAGAAATAGCACAAAAAAAACTGGAGTTCATTAAGAAGGCAGAAGAATATTACAATGCCTTGTGT 
30 ACAAATATACAGTTGAGCGGAGATAAACTAAAAGTAATITCCGTTACTTCTGTTAACCCTGGGGAAGGAAAAACA 
ACTACnTCCATAAATATAGCATGGTCGTTTGCGCGTGCAGGCTATAAAACTCTTTTGATCGATG 
ATTCAGTTATGTTAGGAGTTTTTAAATCTCGTGAAAAAATTACAGGGCTAACAGAATT^ 
TTTATCTCACGGTTTATGTGATA CAAAT ATTGAAAATTTATTTGTAGTTCAATCGGGATCTGT^^ 
ACAGCCTTGTTACAAAGTAAAAATTTTAATGATATGATTGAAACATTGCGTAAATATTTTGATTATATCA^^ 
35 ATACACCGCCTATTGGAATTGTTATTGATGCGGCAATTATCACTCAAAAGTGTGATGCGTCCATCTTGGTAACAGC 
AACAG GTGAG GCGAATAAACGTGATATCCAAAAAGCGAAACAACAATTAAAACAAACAGGGAAACTGTTCCTAG 
GAGTTGTTTTAAATAAATTGGATATCTCGGTTAATAAGTATGGAGTTTACGGTTCCTATGGAAATTATGGTAA^ 
ATAA 

40 MPTLEIAQKKLEFIKKAEEYYNALCrmQI^GDKLKVISVTSVNPGEGKTTTSINIAWSFARAGYKTLLIDGDTRNSVML 
GVFKSREKITGLTEFLSGTADLSHGLCDTNIENLFVVQSGSVSPNPTALLQSKNFNDMIETLRKYFDYIIIDTPPIGIVIDAA 
irrQKCDASILVTATGEANKRDIQKAKQQUCQTGKLFLGVVLNKLDISVNKYGVYGSYGNYGKKZ 

ID13 1182bp

/O^GGAGGCAAATATGAAACATCTAAAAAC^ 

TTTTTAGTGGAGCCTTGGGTAGTTTTTCAATAACTCAACTAACTCAAAAAAGTAGTC^ 

TAGTACTATTACACAAACTGCCTATAAGAACGAAAATTCAACAACACAGGCTGTTAACAAAGTAAAAGATGCTGT 
TGTTTCTGTTATTACITATTCGGCAAACAGACAAAATAGCGTATTTGGCAATGATGATACTGACACAGAT^ 

50 CGAATCTCTAGTGAAGGATCTGGAGTTATTTATAAAAAGAATGATAAAGAAGCTTACATCGTCACCAACAATCAC 
GTTA TTAAT GGCGCCAGCAAAGTAGATATTCGATTGTCAGATGGGACTAAAGTACCTGGAGAAATTGTCGGAGCT 
GACACTTTCTCTGATATTGCTGTCGTCAAAATCTCTTCAGAAAAAGTGACAACAGTAGCTGAGTTTGGTGATTCT 
GTAAGTTAACTGTAGGAGAAACTGCTATTGCCATCGGTAGCCCGTTAGGTTCTGAATATGCAAATACTGTCACTCA 
AGGTATCGTATCCAGTCTCAATAGAAATGTATCCnTAAAATCGGAAGATGGACAAGCTATTTCTACAAAAGCCAT 

55 CCAAACTGATACTGCTATTAACCCAGGTAACTCTGGCGGCCCACTGATCAATATTCAAGGGCAGGTTATCGGAAT 
TACCrCAAGTAAAATTGCTACAAATGGAGGAACATCTGTAGAAGGTClTGGTTTCGCAATTCCTGCAAATGATGCT 
ATCAATATTATTGAACAGTTAGAAAAAAACGGAAAAGTGACGCGTCCAGCTTTGGGAATCCAGATGGTTAATTTA 
TCTAATGTGAGTACAAGCGACATCAGAAGACTCAATATTCCAAGTAATGTTACATCTGGTGTAATTGTTCGTTCGG 
TACAAAGTAATATGCCTGCCAATGGTCACCTTGAAAAATACGATGTAATTACAAAAGTAGATGACAAAGAGATTG 

60 CTTCATCAACAGACTTACAAAGTGCTCTTTACAACCATTCTATCGGAGACACCArrAAGATAA^ 
CGGGAAAGAAGAAACTACCTCTATCAAACTTAACAAGAGTTCAGGTGATTTAGAATCTTAA 

MEANMKHLKTFYKKWFQLLVVIVISFFSGALGSFSrrQLTQKSSVNNSNNNSmQTAYKNENSTTQAVNKVKDAVVSV 
ITYSANRQNSVFGNDDTDTDSQRISSEGSGVIYKKNDKEAYIVTNNHVINGASKVDIRLSDGTKVPGEIVGADTFSDIAV 
VKISSEKVTTVAEFGDSSKLTVGETAIAIGSPLGSEYANTTVTQGIVSSLNRNVSLKSEDGQAISTKAIQTDTAINPGNSGGP
LINIQGQVIGITSSKIATNGGTSVEGLGFAIPANDAINIIEQLEKNGKVTRPALGIQMVNLSNVSTSDIRRLNIPSNVTSGVIV
RSVQSNMPANGHLEKYDVITKVDDKEIASSTDLQSALYNHSIGDTIKITYYRNGKEETTSIKLNKSSGDLESZ 

^ ID15232bE 

ATGGCAGAAATTTATCTAGCAGGTGGTTGTTTTTGGGGCCTAGAGGAATATTTTTCACGCAm 
AAACCAGTGTTGGCTACGCTAATGGTCAAGTCGAAACGACCAATTACCAGTTGCTCAAGGAAACAGACCATGCAG 
AAACGGTCCAAGTGATTTACGATGAGAAGGAAGTGTCACTCAGAGAGATTTTACnTrATTATTTCCGA 
TCCTCTATCTATCAATCAACAAGGGAATGACCGTGGTCGCCAATATCGAACTGGGATTTATTATCAGGATGAAGC 

10 AGATTTGCCAGCTATCTACACAGTGGTGCAGGAGCAGGAACGCATGCTGGGTCGAAAGATTGCAGTAGAAGTGGA 
GCAATTACGCCACTACATTCTGGCTGAAGACTACCACCAAGACTATCTCAGGAAGAATCCTTCAGGTTACTGTCAT 
ATCGATGTGACCGATGCTGATAAGCCATTGATTGATGCAGCAAACTATGAAAAGCCTAGTCAAGAGGTGTTGAAG 
GCCA GTCT ATCTGAAGAGTCTTATCGTGTCACACAAGAAGCTGCTACAGAGGCTCCATTTACCAATGCCTATGACC 
AAACCTTTGAAGAGGGGATTTATGTAGATATTACGACAGGTGAGCCACT 

15 AGGTTGTGGTTGGCCAAGTTTTAGCCGTCCGATTTCCAAAGAGTTGATTCATTATTACAAGGATCTGAGCC 

ATGGAGCGAATTGAAGTTCGTTCrCGTTCAGGCAGTGCTCACTTGGGTCATGTTTTCACAGATGGACCGCGGGAGT 

TAGGCGGCCTCCGTTACTGTATCAATTCTGCTTCTTTACGCTTTGTGGCCAAGGATGAGATGGA^ 

TGGCTATCTATTGCCTTACTTAAACAAATAA 

MAEIYLAGGCFWGLEEYFSRISGVLETSVGYANGQVETTNYQLLKETDHAETVQVIYDEKEVSLREILLYYFRVIDPLSI
NQQGNDRGRQYRTGIYYQDEADLPAIYTVVQEQERMLGRKIAVEVEQLRHYILAEDYHQDYLRKNPSGYCHIDVTDA
DKPLIDAANYEKPSQEVLKASLSEESYRVTQEAATEAPFTNAYDQTFEEGIYVDITTGEPLFFAKDKFASGCGWPSFSRPI
SKELIHYYKDLSHGMERIEVRSRSGSAHLGHVFTDGPRELGGLRYCINSASLRFVAKDEMEKAGYGYLLPYLNKZ

ID17 870bp

ATGAAGATTATTGTACCTGCAACCAGTGCCAATATCGGGCCAGGTmGACTCGGTCGGTGTAGCTGTAACCAAGT 
ATCTTCAAATTGAGGTCTGCGAAGAAC GAGA TGAGTGGCTGATTGAACACCAGATTGGCAAATGGATTCCACATG 
ACGAGCGT AATCT CTTGCTCAAAATCGCmGCAAATTGTACCAGACTTGCAACCAAGACGCTTGAAAATG^^ 

6K) GTGATGTCCCTTTGGCGCXjCGGTTTGGGTTCTTCCAGCTCGGTTATCGTTGCTGGGATTGAACTA 

GGGTCAACTCAACTTATCAGACCATGAAAAATTGCAGTTAGCGACCAAGATTGAAGGGCATCCTGACAATGTGGC 
TCCAGCCA TTTA TGGTAATCTCGTTATTGCAAGTTCTGTTGAAGGGCAAGTCTCTGCTATCGTAGCAGAC^^ 
GAGTGTGATTTTCTAGCTTACATTCCAAACTATGAATTACGTACTCGCGACAGCCGTAGTGTCTTGCCTAAAA 
TGTCTTATAAGGAAGCTGTTGCTGCAAGTTCTATCGCCAATGTAGCGGTTGCTGCCnTGTTGGCAGGAGACATGGT 

35 GACCGCTGGGCAAGCAATCGAGGGAGACCTCTTCCATGAGCGCTATCGTCAGGACTTGGTAAGAGAATTTGCGAT 
GATTAAGCAAGTGACCAAAGAAAATGGGGCCTATGCAACCTACCTTTCTGGTGCTGGGCCGACAGTTATGGTTCT 
GGCTTCTCATGACAAGATGCCAACAATTAAGGCAGAATTGGAAAAGCAACCTTTCAAAGGAAAACTGCAT^ 
GAGAGTTGATACCCAAGGTGTCCGTGTAGAAGCAAAATAA 

40 MKIIVPATSANIGPGFDSVGVAVTKYLQIEVCEERDEWLIEHQIGKWIPHDERNLLLKIALQIVPDLQPRRtKMTSDVPLA 
RGLGSSSSVIVAGIELANQLGQLNLSDHEKLQLATKIEGHPDNVAPAIYGNLVIASSVEGQVSAIVADFPECDFLAYIPNY 
ELRTRDSRSVLPKKLSYKEAVAASSIANVAVAALLAGDMVTAGQAIEGDLFHERYRQDLVREFAMIKQVTKENGAYAT 
YLSGAGPTVMVLASHDKMPTIKAELEKQPFKGKLHDLRVDTQGVRVEAKZ 

ID20 564bp

ATGAAATATCACGATTACATCTGGGATTTAGGTGGAACrrTACrGGATAATTATGAAACTTCAACAGCT^ 
TTGAAACATTGGCACTGTATGGTATCACACAAGACCATGACAGTGTCTATCAAGCTITAAAGGTTTCTACTCCm 
TGCGATTG AGAC ArrCGCTCCCAATTTAGAGAATTTTTTAGAAAAGTACAAGGAAAATGAAGCCAGAG 
5U ACACCCGATTTTATTTGAAGGAGTTTCTGACCTATTGGAAGACATTTCAAATCj^ 

tctcatcgaa atga tcaggttttggaaattttagaaaaaacctctatagcagcttatt^ 
ctagctcaggctttaagagaaagccaaatcccgaatccatgctttatttaagagaaaagtatcagattagct 
gtcttgtcattggtgatcggccgattgatatcgaagcaggtcaagctgcaggacttgatacccacttgm 
tatcgtgaatttaagacaagtattagacatataa 



mkyhdyiwdlggtlldnyetstaafvetlalygitqdhdsvyqalkvstpfaietfapnlenflekykenearelehpi 
lfegvsdtledisnqggrhflvshrndqvlellektsiaayftevvtsssgfkrkpnpesmlylrekyqissglvigdrpid 
ieagqaagldthlftsivnlrqvldiz 

60 ID2^ 187jibp 

ATGACAGAAGAAATCAAAAATOXjCAGGCACAGGATTATGATGCCAGTCAAATTCAAGTTTTAGAGGGC^ 
GCTGTTCGTATGCGTCCAGGGATGTACATTGGATCAACCTCAAAAGAAGGTCTTCACCATCTAGTCTGGG 
TTGATAACTCAATTGACGAGGCCTTGGCAGGATTTGCCAGCCATATTCAAGTTTTTATTGAGCCAGATGAT^^ 
o5 TACTGTTGTGGATGATGGGCGTGGTATCCCAGTCGATATTCAGGAAAAAACAGGCCGTCCTGCTGTTGAGACCGT 
CTTTACAGTCCTTCACGCTGGAGGAAAGTTCGGCGGTGGTGGATACAAGGTTTCAGGTGG
GTCGTCAGTAGTTAATGCCCTITCCACTCAATTAGACGTTCATGTTCACAAAAATGGTAAGATTCATTACC^ 
TACCGTCGTGGTCATGTTGTCGCAGATCTTGAAATAGTTGGAGATACGGATAAAACAGGAACAACTGTTCACTTC 
AC ACCG GACCCAAAAATCTTCACTGAAACAACAATCTTTGATTTTGATAAATTAAATAAACGGATO 
5 GCCTTTCTAAATCGCGGTCTTCAAATTTCAATTACAGATAAGCGCCAAGGTTTGGAACAAACCA^ 

ATGAAGGTGGGATTGCTAGTTACGTTGAATATATCAACGAGAACAAGGATGTAATCTTTGATACACCA^^ 
CAGACGGTGAGATGGATGATATCACAGTTGAGGTAGCCATGCAGTACACAACTGGTTACCATGAAAATGTCATGA 
GTTTCGCCAATAATATTCATACCCATGAAGGTGGAACACATGAACAAGGTTTCCGTACAGCCTTGACACGTGTTAT 
CAACGATTATGCTCGTAAAAATAAGTTACTGAAAGACAATGAAGATAATTTAACAGGGGAAGATGTTCGCGAAGG 
lU CTTAACTGCAGrrATCrCAGTTAAACACCCAAATCCACAGTTTGAAGGACAAACCAAGACCAAATTGGGAAATAG 
CGAAGTGGTCAAGATTACCAATCGCCTCnrCAGTGAAGCTTTCTCCGA 

AAACGTATCGTAGAAAAAGGAATTTTGGCTGCCAAGGCTCGTGTGGCTGCCAAGCGTGCGCGTGAAGTCACACGT 
AAAAAATCTGGTTTGGAAATTTCCAACCTTCCAGGGAAACTAGCAGACTGTTCTTCTAATAACCCTGCTGAAACAG 
AACTCTTCATCGTCGAA GGAG ACTCAGCTGGTGGATCAGCCAAATCTGGTCGTAACCGTGAGTTTCAGGCTATCCT 

15 T CCAAT TCGCGGTAAGATTTrGAACGTTGAAAAAGCAAGTATGGATAAGArrCTAGCCAACGAAGAAATTCGTAG 
TCTTTTCACAGCCATGGGAACAGGATTTGGCGCAGAATTTGATGTTTCGAAAGCCCGTTACCAAA^ 
ATGACCGATGCCGATGTCGATGGAGCCCACATTCGTACCCiU'ClUMTAACCTTGATTTATCGTTATATGAAACCAA 
TCCTAGAAGCTGGTTATGTTTATATTGCCCAACCACCAATCTATGGTGTCAAGGTTGGAAGCGAGATTAAAGA^^ 
TATCCAGCCGGGTGCAGATCAAGAAATCAAACTCCAAGAAGCTTTAGCCCGTTATAGTGAAGGTCGTACCAAACC 

20 GACTATTCAGCGTTATAAGGGGCTAGGTGAAATGGACGATCATCAGCTGTGGGAAACAACCATGGATCCCGAACA 
TCGCTTGATGGCTAGAGTTTCTGTAGATGATGTGCAGAAGCAGATAAAATCTTTGATATGTTGA 

MTEElKNLQAQDYDASQIQVLEGLEAVRMRPGMYIGSTSKEGLHHLVWEIVDNSIDEALAGFASHIQVFIEPDDSrrVVD 
DGRGIPVDIQEKTGRPAVETVFTVLHAGGKFGGGGYKVSGGLHGVGSSVVNALSTQLDVHVHKNGKIHYQEYRRGHV 

25 VADLEIVGDTDKTGTTVHFTPDPKIFTETTIFDFDKLNKRIQELAFLNRGLQISITDKRQGLEQTKHYHYEGGIASYVEYI 
NENKDVIFDTPIYTDGEMDDITVEVAMQYTTGYHENVMSFANNIHTHEGGTHEQGFRTALTRVINDYARKNKLLKDN 
EDNLTGEDVREGLTAVISVKHPNPQFEGQTKTKLGNSEVVKITNRLFSEAFSDFLMENPQIAKRIVEKGILAAKARVAAK 
RAREVTRKKSGLEISNLPGKLADCSSNNPAETELFIVEGDSAGGSAKSGRNREFQAILPIRGKILNVEKASMDKILANEEI 
RSLFTAMGTGFGAEFDVSKARYQKLVLMTDADVDGAHIRTLLLTLIYRYMKPILEAGYVYIAQPPIYGVKVGSEIKEYI 

30 QPGADQEIKLQEAIj\RYSEGRTKPTIQRYKGLGEMDDHQLWETTMDPEHRLMARVSVDDVQKQIKSUCZ 

IDS4 1446bD 

ATGAGTAGAOri TrTAAAA AATCACGTTCACAGAAAGTGAAGCGAAGTGrrAATATAGriT^ 
i5 TATTGTTAGTTTGTTTTTTATTGTTCTTAATCTTTAAGTACAATATCC^ 

CTGCGTTAGTCCTACTAGTTGCCTTGGTAGGGCTACTCTTGATTATCTATAAAAAAGCTGAAAAGm 
CTGTTGGTGTTCTCTATCCTTGTCAGCTCTGTGTCGCTCTTTGCAGTACAGCAGTTTG^^ 

AAATGCGACTTCTAATTACTCAGAATATTCAATCAGTGTCGCTGTTTTAGCAGATAGTGAGATCGAAAATGTTACG 
CAACTGACGAGTGTGACAGCACCGACTGGGACTAATAATGAAAATATTCAGAAATTACTAGCTGATATCAAGTCA 

40 AGTCAGAATACCGATTTGACGGTCAACCAGAGTTCGTCnTACTTGGCAGCITACAAGAGTrrGA^^ 

ACTAAGGCCATTGTCCTAAATAGTGTCnrrGAAAACATCATCGAGTCAGAGTATCCAGACTACGCATCGAAGATA 
AAAAAGATTTATACTAAGGGATTCACTAAAAAAGTAGAAGCTCCTAAGACGTCTAAGAGTCAGTCTTTCAAT^^ 
TATGTTAGTGGAATTGACACCTATGGTCCTATTAGTTCGGTGTCGCGATCAGATGTCAACATCCTGATGACTGTCA 
ATCGAGATACCAAGAAAATCCTCTTGACCACAACGCCACGTGATGCCTATGTACCAATCGCAGATGGTGGAAATA 

45 ATCAAAAAGATAAATTGACTCATGCGGGCATTTATGGAGTTGATTCGTCCATTCACACCITAGAAAAT^ 
AGTGGATATCAATTACTATGTGCGATTGAACTTCACTTCGTTTTTGAAATTGATTGAm 
TTTATAATGATCAAGAATTTACTGCCCATACGAATGGAAAGTATTACCCTGCAGGCAATGT^^ 
ACAGGCrCTCGGTTTTGTTCGTGAGCGCTACTCCCTAGCAGATGGCGATCGTGACCGCGGGCGCCATCAACAA^ 
GGTGATTGTGGCTATCCITCAAAAATTAACGTCAACCGAAGTGCTGAAAAATTATAGTACGATCATTAATAGOT 

50 CAAGATTCTATCCAAACAAATATGCCACTTGAGACCATGATAAATTTGGTCAATGCTCAGTTAGAAAGTGGAGGG 
AATTATAAAGTAAATTCTCAAGATTTAAAAGGGACAGGTCGGATGGATCTTCCTTCTTATGCAATGCCAGACAGT^ 
ACCTCTATGTGATGGAAATAGATGATAGTAGTTTAGCTGTAGTTAAAGCAGCTATACAGGATGTGATGGAGGGTA 
GATGA 

MSRRFKKSRSQKVKRSVNIVLLTIYLLLVCFLLFLIFKYNILAFRYLNLVVTALVLLVALVGLLLIIYKKAEKFTIFLLVFS
ILVSSVSLFAVQQFVGLTNRLNATSNYSEYSISVAVLADSEIENVTQLTSVTAPTGTNNENIQKLLADIKSSQNTDLTVNQ 
SSSYLAAYKSUAGETKAIVLNSVFENIIESEYPDYASKKKIYTKGFTKKVEAPKTSKSQSFNIYVSGIDTYGPISSVSRSDV 
NILMTVNRDTKKILLTTTPRDAYVPIADGGNNQKDKLTHAGIYGVDSSIHTLENLYGVDINYYVRLNFTSFLKLIDLLGG 
IDVYNDQEFTAHTNGKYYPAGNVHLDSEQALGFVRERYSLADGDRDRGRHQQKVlVAILQKLTSTEVLKNYSrnNSLQ 

00 DSIQTNMPLETMINLVNAQLESGGNYKVNSQDLKGTGRMDLPSYAMPDSNLYVMEIDDSSLAVVKAAIQDVMEGRZ 
ATGATAGACATCCATTCGCATATCGTTTTTGATGTAGATGACGGTCCCAAGTCAAGAGAGGAAAGCAAGGCTCTC
TTGGCAGAATCCTACAGACAGGGGGTGCGAACCATTGTTTCTACCTCTCACCGTCGCAAGGGCATGTTTG^ 
CGGAAGAGAAGATAGCAGAAAACmCTTCAGGTTCGGGAAATAGCTAAGGAAGTGGCGAGTGACTTGGTCATTG 
CTTACGGGGCTGAAATTTATTACACACCAGATGTTCTGGATAAGCTGGAAAAAAAGCGGATTCCGACCCT 
D ATAGTCGTTATGCCTTGATAGAGTTTAGTATGAACACTCCTTATCGCGATATTCATAGCGCCnTGAGCAAG^^ 
GATGTTGGGAATTACTCCAGTCATTGCCCACATTGAGCGCTATGATGCTCTTGAAAATAATGAAAAACGCGTO 
GAACTGATCGATATGGGCTGTTACACGCAAGTAAATAGrrCACATGTCCrCAAACCCAAACTTTTTGGC^ 
ATAAATTCATGAAAAAAAGAGCTCAGTATTTTTTAGAGCAGGATTTGGTTCATGTCATTGCAAGTGAT^^^ 
TCTA GACGGT AGACCTCCTCATATGGCAGAAGCATATGACCTTGTTACCCAAAAATACGGAGAAGCGAAGGCTCA 
1 U GGAACTTmATAGACAATCCTCG AAAAATTGTAATGGATCAACTAATTTAG 

MIDIHSHIVFDVDDGPKSREESKALLAESYRQGVRTIVSTSHRRKGMFETPEEKIAENFLQVREIAKEVASDLVIAYGAEI 
YYTPDVLDKLEKKRIPTLNDSRYALIEFSMNTPyRDIHSALSKILMLGITPVIAHIERYDALENNEKRVRELIDMGCYTQV- 

NSSHVLKPKLFGERYKFMKKRAQYFLEQDLVHVIASDMHNLDGRPPHMAEAYDLVTQKYGEAKAQELFIDNPRKIVM 
DQLIZ

yDS8 3??Qbp 

TTGA TTTATATAATCGCTATCAATATAACAATGCAATCAGGAGGTTTTGCAATGAAACATGAAAAACA^^ 

20 TTTTCTATTCGTAAATACGCTGTAGGAGCAGCTTCTGTTCTAATTGGATTTGCCTTCCAAGC^ 

CCGATGGAGTTACTCCTACTACTACAGAAAACCAACCGACCATCCATACGGTTTCTGATTCCCCTCAATCATCCGA 
AAATCGGACTGAGGAAACACCTAAAGCAGTGCTTCAACCAGAAGCTCCAAAAACTGTAGAAACAGAAACTCCAG 
CTACTGATAAGGTAGCTAGTCTTCCAAAAACAGAAGAAAAACCACAAGAGGAAGTTAGTTCAACTCCTAGTGATA 
AAGCAGAAGTGGTAACTCCAACTTCTGCTGAAAAAGAAACTGCTAATAAAAAGGCAGAAGAAGCTAGCCCT 

25 AAGGAAGAAGCGAAAGAGGTTGATTCTAAAGAGTCAAATACAGACAAGACTGACAAGGATAAACCAGCTAAAAA 
AGATGAAGCGAAAGCAGAGGCTGACAAACCGGCAACAGAGGCAGGAAAGGAACGTGCTGCAACTGTAAATGAAA 
AACTAGCGAAAAAGAAAATTGTTTCTATTGATGCTGGACGTAAATATTTCTCACCAGAACAGCTCAAGGA^ 
TCGATAAAGCGAAACATTATGGCTACACTGATTTACACCTATTAGTCGGAAATGATGGACTCCGrrrCATGTTG 
CGATATGAGCATCACAGCTAACGGCAAGACCTATGCCAGTGACGATGTCAAACGCGCCATTGAAAAAGGTACAAA 

30 TGATTATTACAACGATCCAAACGGCAATCACTTAACAGAAAGTCAAATGACAGATCTGATTAACTATGCCA^ 

TAAAGGTATCGGTCTCAT TCCGA CAGTAAATAGTCCTGGACACATGGATGCGATTCTCAATGCCATGAAAGAATT 
GGGA ATCCAA AACCCTAACTITAGCTATTTTGGGAAGAAATCAGCCCGTACn'GTCGATCTTG 
TGTCGCTTTTACAAAAGCCCTTATCGACAAGTATGCTGCnTArrTCGCGAAAAAGACTGAA^ 
CTTGATGAATATGCCAATGATGCGACAGATGCTAAAGGTTGGAGTGTGCTTCAAGCTGATAAATACTATCCAAAC 

35 GAAGGCTACCCTGTAA AAGGC TATGAAAAATTTATTGCCTACGCCAATGACCTCGCTCGTATTGTAAAATCGCAC 
GGTCTCAAACCAATGGCTTTTAACGACGGTATCTACTACAATAGCGACACAAGCTTTGGTAGTTT^ 
ATCATCGTTTCTATGTGGACTGGTGGrrGGGGAGGCTACGATGTCGCTTCTTCTAAACTACTAGCT^ 
ACCAAATCCTTAATACCAATGATGCTTGGTACTACGTTCrTGGACGAAACGCTGATGGCCAAGGCTGGTACAA^ 
CGATCAGGGGCrCAATGGTATTAAAAACACACCAATCACTTCTGTACCAAAAACAGAAGGAGCTGATATCCCAAT 

40 CATCG GTGG TATGGTAGCTGCTTGGGCTGACACTCCATCTGCACGTTATTCACCATCACGCCTCTTCAAACTCATG 
CGTCATTTTGCAAATGCCAACGCTGAATACTTCGCAGCTGATTATGAATCTGCAGAGCAAGCAC^ 
CCAAAAGACCTGAACCGTTATACTGCAGAAAGCGTCACGGCCGTAAAAGAAGCTGAAAAAGCTATTCGCTCTCTC 
GATAGCAACCTTAGCCGTGCCCAACAAGATACGATTGATCAAGCCATTGCTAAACTTCAAGAAACTGTCAACAAC 
TTGACCCTCACGCCTGAAGCTCAAAAAGAAGAAGAAGCTAAACGTGAGGTTGAAAAACrrGCCAAAAACAAGG^ 

45 AATCTCAATCGATGCTGGACGCAAATACTTTACTCTGAACCAGCTCAAACGCATCGTAGACAAGGCCAGTGAGCT 
CGGATATTCTGATGTCCATCTCCTTCrAGGAAATGACGGACTTCGCTTTCTACTCGATGAT^^ 
AACGGAAAAACCTATGCTAGTGATGACGTTAAAAAAGCTATTATCGAAGGAACTAAAGCTTACTACGACGATCC^ 
AACGGTACTGCACTAACACAGGCAGAAGTAACAGAGCTAATTGAATACGCTAAATCTAAGGACATCGGTCTCATC 
CCAGCTATTAACAGTCCAGGTCACATGGATGCTATGCTGGTTGCCATGGAAAAATTAGGTATTAAAAATCCrC^ 

00 GCCCACTTTGATAAAGTTTC AAAA ACAACTATGGACTTGAAAAACGAAGAAGCGATGAACT^ 
ATCGGTAAATACATGGACTTCTTTGCAGGTAAAACAAAGATTITCAACTTTGGTACTGACG/^^ 
GCGACTAGTGCCCAAGGCTGGTACTACCrCAAGTGGTATCAACTCTATGGCAAATTTGCCGAATATGCCAACACC 
CTCGCAGCTATGGCCAAAGAAAGAGGGCTTCAACCAATGGCCITCAACGATGGCITCTACTATGAAGACAAGGAC 
GATGTTCAGTTTGACAAAGATGTCrrGATTTCTTACTGGTCTAAAGGCTGGTGGGGATATAACCT^^ 

JJ AATACCTAGCAAGCAAAGGCTATAAATTCTTGAATACCAACGGTGACTGGTACTACATTCTTGGTCAAAAACCAG 
AAGATGGTGGTGGTTTCCTCAAGAAAGCTATTGAGAATACTGGAAAAACACCATTCAATCAACrAGOT 
AATATCCTGAAGT AGATC TTCCAACAGTCGGAAGTATGCTTTCAATCTGGGCAGATAGACCAAGCGCTGAATACA 
AGGAAGAGGAAATCTTTGAACTCATGACTGCCTTTGCAGACCACAACAAAGACTACm 

CTCTCCGCGAAGAATTAGCTAAAATTCCTACAAACTTAGAAGGATATAGTAAAGAAAGTCTTGAGGCCCnTGArc 
00 CAGCTAAAACAGCTCTAAATTACAACCTCAACCGTAATAAACAAGCTGAGCTTGACACGCTTGTAGCCAACCTAA 
AAGCCGCTCTTCAAGGCCTCAAACCAGCTGTAACTCATTCAGGAAGCCTAGATGAAAATGAAGTGGCT^ 
TTGAAACCAGACCAGAACTCATCACAAGAACTGAAGAAATTCCATTTGAAGTTATCAAGAAAGAAAATCCTAACC 
TCCCAGCCGGTCAGGAAAATATTATCACAGCAGGAGTCAAAGGTGAACGAACTCATTACATCTCTGTACTCACTG 
AAAATGGAAAAACAACAGAAACAGTCCTTGATAGCCAGGTAACCAAAGAAGTTATAAACCAAGTGGTTGAAGTT 
OD GGCGCTCCTGTAACTCACAAGGGTGATGAAAGTGGTCTTGCACCAACTACTGAGGTAAAACCTAGACTGGATATC 
CAAGAAGAAGAAATTCCATTTACCACAGTGACITGTGAAAATCCACTCTTACTCAAAGGAAA^
ACTAAGGGCGTCAATGGACATCGTAGCAACTTCTACTCTGTGAGCACTTCTGCCGATGGTAAGGAAGTGAA^ 
CTTGTAAATAGTGTCGTAGCACAGGAAGCCGTTACTCAAATAGTCGAAGTCGGAACTATGGTAACACATGTAGGC 
GATGAAAACGGACAAGCCGCTATTGCTGAAGAAAAACCAAAACTAGAAATCCCAAGCCAACCAGCTCCATCAAC 
5 TGCTCCTGCTGAGGAAAGCAAAGTTCTTCCTCAAGATCCAGCTCCTGTGGTAACAGAGAAAAAACTTCCT^ 

AGGAACTCACGATTCTGCAGGACTAGTAGTCGCAGGACTCATGTCCACACTAGCAGCCTATGGACTCACTAAAAG 
AAAAGAAGACTAA 

MrVIUINrrMQSGGFAMKHEKQQRFSIRKYAVGAASVLIGFAFQAQTVAADGVTPTTTENQPTIHTVSDSPQSSENRTE^ 

lU TPKAVLQPEAPKTVETETPATDKVASLPKTEEKPQEEVSSTPSDKAEVVTPTSAEKETANKKAEEASPKKEEAKEVDSKE 
SNTDKTDKDKPAKKDEAKAEADKPATEAGKERAATVNEKLAKKKIVSIDAGRKYFSPEQLKEIIDKAKHYGYTDLH 
VGNDGLRFMLDDMSITANGKTYASDDVKRAIEKGTNDYYNDPNGNHLTESQMTDLINYAKDKGIGLIPTVNSPGHMD 
AILNAMKELGIQNPNFSYFGKKSARTVDLDNEQAVAFTKAUDKYAAYFAKKTEIFNIGLDEYANDATDAKGWSVLQA 
DKYYPNEGYPVKGYEKFUYANDLARIVKSHGLKPMAFNDGIYYNSDTSFGSFDKDIIVSMWTGGWGGYDVASSKLLA 

15 EKGHQILm'NDAWYYVLGRNADGQGWYNLDQGLNGIK^^•PITSVPKTEGADIP^GGMVAAWADTPSARYSPSRLFKL 
MRHFANANAEYFAADYESAEQALNEVPKDLNRYTAESVTAVKEAEKAIRSLDSNLSRAQQDTIDQAIAKLQETVNNLT 
LTPEAQKEEEAKREVEKIAKmCVISIDAGRKYFTLNQLKRIVDKASELGYSDVHLLLGNDGLRFLLDDMTITANGKTYA 
SDDVKKAIIEGTKAYYDDPNGTALTQAEVTELIEYAKSKDIGLlPAINSPGHMDAMLVAMEKLGIK^a>QAHFDKVSKTT 
MDLKNEEAMNFVKALIGKYMDFFAGKTKIFNFGTDEYANDATSAQGWYYLKWYQLYGKFAEYANTLAAMAKERGL 

20 QPMAFNDGFYYEDKDDVQFDKDVLISYWSKGWGYNLASPQYLASKGYKFLNTNGDWYYILGQKPEDGGGFLKKAI 
E^r^GKTPFNQLASTKYPEVDLPTVGSMLSIWADRPSAEYKEEEIFELMTAFADHNmYFRANYNALREELAKIPTNLEG 
YSKESLEALDAAKTALNYNLNRNKQAELDTLVANLKAALQGLKPAVTHSGSLDENEVAANVETRPELITRTEEIPFEVI 
KKENPNLPAGQENirTAGVKGERTHYISVLTENGKTTETVLDSQVTKEVlNQVVEVGAPVTHKGDESGLAPTTEVKPRL 
DIQEEEIPFITVTCENPLLLKGKTQVnXGVNGHRSNFYSVSTSADGKEVKTLVNSVVAQEAVTQIVEVGTMVTHVGDE 

NGQAAIAEEKPKLEIPSQPAPSTAPAEESKVLPQDPAPVVTEKKLPETGTHDSAGLVVAGLMSTLAAYGLTKRKEDZ

ID122 825bp

ATGAACAAAAAAACAAGACAGACACTAATCGGACTGCTAGTGTTATTGCTTTTGTCTACAGGGAGCTAT^^ 
30 AAGCAGATGCCGTCGGCACCTAATAGTCCCAAAACCAATCTTAGTCAGAAAAAACAAGCGTCTGAAGCTCCTAGT 
CAAGC ATTGG CAGAGAGTGTCTTAACAGACGCAGTCAAGAGTCAAATAAAGGGGAGTCTGGAGTGGAATGGCTC 
AGGTGCTTTTATCGTCAATGGTAATAAAACAAATCTAGATGCCAAGGTTrCAAGTAAGCCCTA 
AACAAAGACAGTGGGCAAGGAAACTGTTCCAACCGTAGCTAATGCCCTCTTGTCTAAGGCCACTCGTCAGTACAA 
GAATCGTAAAGAAACTGGGAATGGTTCAACTTCTTGGACTCCTCCAGGTTGGCATCAGGTCAAGAATCTAAAGG^ 
35 CrCTTATACCCATGCAGTCGATAGAGGTCATTTGTTAGGCTATGCCTTAATCGGTGGTTTGGATGGTm 

CAACAAGCAATCCTAAAAACATTGCTGTTCAGACAGCCTGGGCAAATCAGGCACAAGCCGAGTATTCGACTGGTC 
AAAACTACTATGAAAGCAAGGTGCGTAAAGCCTTGGACCAAAACAAGCGTGTCCGTTACCGTGTAACCCm 
ACGCTTCAAACGAGGATTTAGTTCCCTCAGCTTCACAGATTGAAGCCAAGTCTTCGGATGGAGAATTGGAAT^ 
ATGTTCTAGTTCCCAATGTTCAAAAGGGACTTCAACTGGATTACCGAACTGGAGAAGTAACTGTAACTCAGT^ 



Ml^TRQTLIGLLVLlXl^GSYYKQMPSAPNSPKTM^QKKQASEAPSQALAESVLTDAVKSQIKGSLEW 
NG^nCTNLDAKVSSKPYAD^^CTO^GKETVPTVANALl^KATRQYKNRKETGNGSTSW^PPGWHQVKNLKGSYT^^ 
DRGHLLGYALIGGLDGFDASTSNPKNIAVQTAWANQAQAEYSTGQNYYESKVRKALDQNKRVRYRVTLYYASNEDL 
VPSASQIEAKSSDGELEFNVLVPNVQKGLQLDYRTGEVTVTQZ 

ID123 225bp 



GTGCTAAGATTCAGCGGATTGAGGCAAGTGATGAAGATGAATAAGAAATCAAGCTACGTAGTCAAGCGTTTACTT
TTAGTCATCATAGTACTGATTTTAGGTACTCTGGCTCTAGGAATCGGTTTAATGGTAGGTTATGG^^
AGGGTCAAGATCCATGGGCTATCCTGTCTCCAGCAAAATGGCAGGAATTGATTCATAAATTTACAGGAAATTAG



VLRFSGLRQVMKMNKKSSYVVKRLLLVIIVLILGTLALGIGLMVGYGILGKGQDPWAILSPAKWQELIHKFTGNZ
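
The tables above pair each DNA sequence with its conceptual translation, and a terminal Z in the protein lines corresponds to the stop codon. As a minimal sketch only (it is not part of the original disclosure), the following Python translates the first 75 bases of the ID123 DNA sequence. The function name `translate`, the use of the standard genetic code, and the `X` placeholder for unreadable codons are assumptions made for illustration. The initial GTG codon is rendered as V, as it is in the listing.

```python
from itertools import product

# Standard genetic code in TCAG order. "Z" marks a stop codon, matching the
# notation of the protein listings (assumption: the listings were produced
# with the standard table).
BASES = "TCAG"
AMINO = "FFLLSSSSYYZZCCZWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA codon by codon; unreadable codons become 'X'."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# First 75 bases of the ID123 DNA sequence, whitespace removed.
id123_start = (
    "GTGCTAAGATTCAGCGGATTGAGGCAAGTGATGAAGATGAATAAGAAA"
    "TCAAGCTACGTAGTCAAGCGTTTACTT"
)
print(translate(id123_start))  # VLRFSGLRQVMKMNKKSSYVVKRLL
```

Comparing such a translation with the listed protein line is one way to spot characters that were misread when the listing was scanned.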
CLAIMS:

1. A Streptococcus pneumoniae protein or polypeptide having a sequence 
selected from those shown in Table 1.

2. A Streptococcus pneumoniae protein or polypeptide having a sequence 
selected from those shown in Table 2.

3. A protein or polypeptide as claimed in claim 1 or claim 2 provided in
substantially pure form.

4. A protein or polypeptide which is substantially identical to one defined in any 
one of claims 1 to 3. 

5. A homologue or derivative of a protein or polypeptide as defined in any one of
claims 1 to 4. 

6. An antigenic and/or immunogenic fragment of a protein or polypeptide as
defined in Tables 1-3.

7. A nucleic acid molecule comprising or consisting of a sequence which is: 

(i) any of the DNA sequences set out in Table 1 or their RNA equivalents; 

(ii) a sequence which is complementary to any of the sequences of (i); 
(iii) a sequence which codes for the same protein or polypeptide as those
sequences of (i) or (ii);

(iv) a sequence which is substantially identical with any of those of (i), (ii)
and (iii);

(v) a sequence which codes for a homologue, derivative or fragment of a 
protein as defined in Table 1.

8. A nucleic acid molecule comprising or consisting of a sequence which is:

(i) any of the DNA sequences set out in Table 2 or their RNA equivalents; 

(ii) a sequence which is complementary to any of the sequences of (i); 

(iii) a sequence which codes for the same protein or polypeptide as those
sequences of (i) or (ii);

(iv) a sequence which is substantially identical with any of those of (i), (ii)
and (iii);

(v) a sequence which codes for a homologue, derivative or fragment of a 
protein as defined in Table 2. 

9. The use of a protein or polypeptide having a sequence selected from those
shown in Tables 1-3, or homologues, derivatives and/or fragments thereof, as an 
immunogen and/or antigen. 
10. An immunogenic and/or antigenic composition comprising one or more
proteins or polypeptides selected from those whose sequences are shown in Tables
1-3, or homologues or derivatives thereof, and/or fragments of any of these.

11. An immunogenic and/or antigenic composition as claimed in claim 10 which
is a vaccine or is for use in a diagnostic assay. 

12. A vaccine as claimed in claim 11 which comprises one or more additional 
components selected from excipients, diluents, adjuvants or the like. 

13. A vaccine composition comprising one or more nucleic acid sequences as 
defined in Tables 1-3. 

14. A method for the detection/diagnosis of S. pneumoniae which comprises the
step of bringing into contact a sample to be tested with at least one protein or
polypeptide as defined in Tables 1-3, or homologue, derivative or fragment thereof.

15. An antibody capable of binding to a protein or polypeptide as defined in
Tables 1-3, or to a homologue, derivative or fragment thereof.

16. An antibody as defined in claim 15 which is a monoclonal antibody. 

17. A method for the detection/diagnosis of S. pneumoniae which comprises the step
of bringing into contact a sample to be tested and at least one antibody as defined in
claim 15 or claim 16.

18. A method for the detection/diagnosis of S. pneumoniae which comprises the
step of bringing into contact a sample to be tested with at least one nucleic acid
sequence as defined in claim 7 or claim 8.

19. A method of determining whether a protein or polypeptide as defined in
Tables 1-3 represents a potential anti-microbial target which comprises inactivating
said protein or polypeptide and determining whether S. pneumoniae is still viable.

20. The use of an agent capable of antagonising, inhibiting or otherwise
interfering with the function or expression of a protein or polypeptide as defined in
Tables 1-3 in the manufacture of a medicament for use in the treatment or
prophylaxis of S. pneumoniae infection.
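
Items (i) to (iii) of claims 7 and 8 relate nucleic acid sequences through simple transformations: the RNA equivalent of a DNA sequence, a complementary sequence, and a different sequence that codes for the same protein. The sketch below illustrates those three relationships in Python and is not part of the claims. It assumes that "complementary" means the reverse complement strand and that the standard genetic code applies. The helper names and the nine-base example sequence are hypothetical, and "substantially identical" in item (iv) is not modelled.

```python
from itertools import product

# Standard genetic code in TCAG order; "Z" marks a stop codon, as in the tables.
BASES = "TCAG"
AMINO = "FFLLSSSSYYZZCCZWLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def rna_equivalent(dna: str) -> str:
    """Item (i): the RNA equivalent replaces thymine with uracil."""
    return dna.upper().replace("T", "U")


def reverse_complement(dna: str) -> str:
    """Item (ii), read here as the reverse complement (an assumption)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate whole codons of an unambiguous DNA sequence."""
    seq = dna.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def codes_for_same_protein(dna_a: str, dna_b: str) -> bool:
    """Item (iii): different codons, identical translation."""
    return translate(dna_a) == translate(dna_b)


orf = "ATGAAATAA"  # hypothetical example, not taken from the tables
print(rna_equivalent(orf))                        # AUGAAAUAA
print(reverse_complement(orf))                    # TTATTTCAT
print(codes_for_same_protein(orf, "ATGAAGTAA"))   # True: AAA and AAG both encode K
```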